From martin at v.loewis.de  Tue Jun  1 00:42:50 2010
From: martin at v.loewis.de ("Martin v. Löwis")
Date: Tue, 01 Jun 2010 00:42:50 +0200
Subject: [Python-Dev] _XOPEN_SOURCE on Solaris
Message-ID: <4C043B6A.4070900@v.loewis.de>

In issue 1759169 people have been demanding for quite some time that the 
definition of _XOPEN_SOURCE on Solaris should be dropped, as it was 
unneeded and caused problems for other software.

Now, issue 8864 reports that the multiprocessing module fails to 
compile, and indeed, if _XOPEN_SOURCE is not defined, control messages 
stop working. Several of the CMSG interfaces are only available if 
_XPG4_2 is defined (and, AFAICT, under no other condition); this, in 
turn, apparently is only defined if _XOPEN_SOURCE is 500, 600, or (has 
an arbitrary value and _XOPEN_SOURCE_EXTENDED is 1).
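
As a rough illustration, the condition described above can be written as a
small predicate. This is only a model of the rule as stated in this message,
not a copy of the Solaris header logic; the function name and the use of
None for "macro not defined" are assumptions made for the sketch.

```python
def xpg4_2_enabled(xopen_source=None, xopen_source_extended=None):
    """Model of when _XPG4_2 would be defined, per the description above.

    None stands for "macro not defined". Illustrative only.
    """
    if xopen_source is None:
        # Without _XOPEN_SOURCE the CMSG interfaces are reported missing.
        return False
    if xopen_source in (500, 600):
        return True
    # Any other value counts only together with _XOPEN_SOURCE_EXTENDED == 1.
    return xopen_source_extended == 1
```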

So how should I go about fixing that?
a) revert the patch for #1759169, documenting that Python compilation
    actually requires _XOPEN_SOURCE to be defined, or
b) define _XOPEN_SOURCE only for the multiprocessing module.

Any input appreciated.

Regards,
Martin

From greg.ewing at canterbury.ac.nz  Tue Jun  1 02:33:23 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Tue, 01 Jun 2010 12:33:23 +1200
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <20100531184522.173170@gmx.net>
References: <20100531184522.173170@gmx.net>
Message-ID: <4C045553.70909@canterbury.ac.nz>

smarv at gmx.net wrote:
> Now, the problem is, Python appears to read-access the deallocated memory 
> still after tp_dealloc.

It's not clear exactly what you mean by "after tp_dealloc".
The usual pattern is for a type's tp_dealloc method to call
the base type's tp_dealloc, which can make further references
to the object's memory. At the end of the tp_dealloc chain,
tp_free gets called, which is what actually deallocates the
memory.

I would say your tp_dealloc shouldn't be modifying anything
in the object struct that your corresponding tp_alloc method
didn't set up, because code further along the tp_dealloc
chain may rely on it. That includes fields in the object
header.
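
The chain described above can be pictured with a toy model in Python. It is
only a sketch: ToyObject, sub_dealloc, base_dealloc and the clobber_header
switch are made-up names, and the "struct" is an ordinary Python object, not
CPython's real layout.

```python
class ToyObject:
    """Toy stand-in for an object struct; not CPython's real layout."""

    def __init__(self):
        self.ob_type = "Sub"       # header field
        self.payload = [1, 2, 3]   # field set up by the subtype
        self.freed = False


def tp_free(obj):
    # Only at the very end of the chain is the memory actually released.
    obj.freed = True


def base_dealloc(obj):
    # The base type still reads the header before freeing.
    if obj.ob_type is None:
        raise RuntimeError("header was clobbered before the end of the chain")
    tp_free(obj)


def sub_dealloc(obj, clobber_header=False):
    obj.payload = None             # release what the subtype itself set up
    if clobber_header:
        obj.ob_type = None         # the mistake warned about above
    base_dealloc(obj)              # continue along the tp_dealloc chain
```

Calling sub_dealloc(ToyObject()) runs the chain to the end, while
sub_dealloc(ToyObject(), clobber_header=True) fails inside base_dealloc,
before anything has been freed.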

-- 
Greg

From pje at telecommunity.com  Tue Jun  1 04:18:02 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 31 May 2010 22:18:02 -0400
Subject: [Python-Dev] Implementing PEP 382, Namespace Packages
In-Reply-To: <AANLkTimGzZ34Gu6cc3-_aSKLtOrqKyXgluiGbhn7vHJm@mail.gmail.com>
References: <20100530074041.2279D3A405F@sparrow.telecommunity.com>
	<AANLkTinjHXpfLtmwpxouqdqJixQSXKzcYVUi0py1a4hT@mail.gmail.com>
	<20100531050328.40B073A402D@sparrow.telecommunity.com>
	<AANLkTimGzZ34Gu6cc3-_aSKLtOrqKyXgluiGbhn7vHJm@mail.gmail.com>
Message-ID: <20100601021806.360AA3A402D@sparrow.telecommunity.com>

At 01:19 PM 5/31/2010 -0700, Brett Cannon wrote:
>But as long as whatever mechanism gets exposed allows people to work
>from a module name that will be enough. The path connection is not
>required as load_module is the end-all-be-all method. If we have a
>similar API added for .pth files that works off of module names then
>those loaders that don't want to work from file paths don't have to.

Right - that's why I suggested that a high-level request like 
get_pth_contents() would give the implementer the most 
flexibility.  Then they don't have to fake a filesystem if they don't 
actually work that way.

For example, a database that maps module names to code objects has no 
need for paths at all, and could just return either ['*'] or None 
depending on whether the package was marked as a namespace package in 
the database...  without needing to fake up the existence of a .pth 
file in a virtual file system.

(Of course, since lots of implementations *do* use filesystem-like 
backends, giving them some utility functions they can use to 
implement the API on top of filesystem operations gives us the best 
of both worlds.) 
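
A minimal sketch of the database example, assuming a get_pth_contents()
method that takes a module name. The class name, the constructor arguments
and the exact signature are hypothetical; nothing here is an agreed PEP 382
API.

```python
class DatabaseImporter:
    """Hypothetical importer backed by a mapping of module names to code objects."""

    def __init__(self, code_by_name, namespace_packages):
        self._code_by_name = dict(code_by_name)
        self._namespace_packages = set(namespace_packages)

    def get_pth_contents(self, fullname):
        # Assumed convention from this thread: ['*'] marks a namespace
        # package, None means "not a namespace package". No paths and no
        # fake .pth file are involved.
        if fullname in self._namespace_packages:
            return ['*']
        return None


# "acme" and "acme.tools" are made-up names used only for the example.
importer = DatabaseImporter(
    code_by_name={"acme.tools": compile("VALUE = 1", "<db>", "exec")},
    namespace_packages={"acme"},
)
```

Here importer.get_pth_contents("acme") returns ['*'] and any name not marked
in the database returns None.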


From smarv at gmx.net  Tue Jun  1 09:10:20 2010
From: smarv at gmx.net (smarv at gmx.net)
Date: Tue, 01 Jun 2010 09:10:20 +0200
Subject: [Python-Dev] tp_dealloc
Message-ID: <20100601071020.325170@gmx.net>

My tp_dealloc method (of non-subtypable type) calls the freeMem-method 
of a memory manager (this manager was also used for the corresponding allocation). 
This freeMem-method deallocates and modifies the memory, 
which is a valid action, because after free, the memory-manager 
has ownership of the freed memory. 
Several memory managers do this (for example the Memory Manager in 
Delphi during debug mode, in order to track invalid memory access after free).

The python31.dll calls tp_alloc and later (after the return of tp_alloc)
the python31.dll still expects valid content in the deallocated memory.
I don't know where this happens, I'm not a developer of CPython,
but at this point the python31.dll causes an access violation.
IMO the python31.dll assumes that freeMem never modifies the memory
(pyobject header); this is valid for many memory managers, but not for all.
And from my perspective, this assumption is a bug, which can cause access
violations in many applications (for example, applications which use the
PythonForDelphi-package; PyScripter is one of them, but also many others).

Please, could some CPython-developer take a look, thank you!

From smarv at gmx.net  Tue Jun  1 09:41:12 2010
From: smarv at gmx.net (smarv at gmx.net)
Date: Tue, 01 Jun 2010 09:41:12 +0200
Subject: [Python-Dev] tp_dealloc
Message-ID: <20100601074112.199330@gmx.net>

Sorry, I wrote tp_alloc in last post, it should be always tp_dealloc:

My tp_dealloc method (of non-subtypable type) calls the freeMem-method 
of a memory manager (this manager was also used for the corresponding allocation).
This freeMem-method deallocates and modifies the memory, 
which is a valid action, because after free, the memory-manager 
has ownership of the freed memory. 
Several memory managers do this (for example the Memory Manager in 
Delphi during debug mode, in order to track invalid memory access after free).

The python31.dll calls tp_dealloc and later (after the return of tp_dealloc)
the python31.dll still expects valid content in the deallocated memory.
I don't know where this happens, I'm not a developer of CPython,
but at this point the python31.dll causes an access violation.
IMO the python31.dll assumes that freeMem never modifies the memory
(pyobject header); this is valid for many memory managers, but not for all.
And from my perspective, this assumption is a bug, which can cause access
violations in many applications (for example, applications which use the
PythonForDelphi-package; PyScripter is one of them, but also many others).

Please, could some CPython-developer take a look, thank you!

From amauryfa at gmail.com  Tue Jun  1 11:52:44 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 1 Jun 2010 11:52:44 +0200
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <20100601074112.199330@gmx.net>
References: <20100601074112.199330@gmx.net>
Message-ID: <AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>

2010/6/1  <smarv at gmx.net>:
> Sorry, I wrote tp_alloc in last post, it should be always tp_dealloc:
>
> My tp_dealloc method (of non-subtypable type) calls the freeMem-method
> of a memory manager (this manager was also used for the corresponding allocation).
> This freeMem-method deallocates and modifies the memory,
> which is a valid action, because after free, the memory-manager
> has ownership of the freed memory.
> Several memory managers do this (for example the Memory Manager in
> Delphi during debug mode, in order to track invalid memory access after free).
>
> The python31.dll calls tp_dealloc and later (after the return of tp_dealloc)
> the python31.dll still expects valid content in the deallocated memory.
> I don't know where this happens, I'm not a developer of CPython,
> but at this point the python31.dll causes an access violation.
> IMO the python31.dll assumes that freeMem never modifies the memory
> (pyobject header); this is valid for many memory managers, but not for all.
> And from my perspective, this assumption is a bug, which can cause access
> violations in many applications (for example, applications which use the
> PythonForDelphi-package; PyScripter is one of them, but also many others).
>
> Please, could some CPython-developer take a look, thank you!

CPython does not access memory after the call to tp_dealloc.
There is even a build mode (--without-pymalloc) where tp_dealloc calls
free() at the end, which would cause crashes if the memory were read
afterwards.

This said, there may be a bug somewhere, but what do you want us to look at?
Do you have a case that we could reproduce and investigate?

-- 
Amaury Forgeot d'Arc

From smarv at gmx.net  Tue Jun  1 14:21:57 2010
From: smarv at gmx.net (smarv at gmx.net)
Date: Tue, 01 Jun 2010 14:21:57 +0200
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>
References: <20100601074112.199330@gmx.net>
	<AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>
Message-ID: <20100601122157.225330@gmx.net>

> This said, there may be a bug somewhere, but what do you want us to look
> at?
> Do you have a case that we could reproduce and investigate?
> 
> -- 
> Amaury Forgeot d'Arc

Thank you, I'm not a C-Developer, 
but still I have one more detail:

I call py_decRef( pyObj) of dll (version 3.1.1), 
( which calls tp_dealloc, which calls my freeMem() method))
No problem is reported here.
Now, the freed memory should not be accessed anymore by python31.dll. 
You may fill the freed pyObjectHead with invalid values, 
in my case it's:  ob_refcnt= 7851148, ob_type = $80808080 

But later, when I call Py_Finalize, 
there inside is some access to the same freed memory; 
this causes an AV, more precisely, 
when the value $80808080 is checked.

My Delphi-Debugger shows the following byte-sequence inside python31.dll:
5EC3568B7424088B4604F74054004000007504

5E                  - pop esi
C3                  - ret    
56                  - push esi
8B742408            - mov esi, [esp+$08]
8B4604              - mov eax, [esi+$04]  
       // eax = $80808080 //

F7405400400000      - test [eax+$54], $00004000 
       // AV exception by read of address $808080D4 // 

7504                - jnz $1e03681b


Maybe this can help someone, thank you!

-- 
Marvin


From amauryfa at gmail.com  Tue Jun  1 15:00:21 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 1 Jun 2010 15:00:21 +0200
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <20100601122157.225330@gmx.net>
References: <20100601074112.199330@gmx.net>
	<AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>
	<20100601122157.225330@gmx.net>
Message-ID: <AANLkTinAGxkYoif3vjBaUgAYuiBfrahHS7jGPF6N-X3-@mail.gmail.com>

2010/6/1  <smarv at gmx.net>:
>> This said, there may be a bug somewhere, but what do you want us to look
>> at?
>> Do you have a case that we could reproduce and investigate?
>>
>> --
>> Amaury Forgeot d'Arc
>
> Thank you, I'm not a C-Developer,
> but still I have one more detail:
>
> I call py_decRef( pyObj) of dll (version 3.1.1),
> ( which calls tp_dealloc, which calls my freeMem() method))
> No problem is reported here.
> Now, the freed memory should not be accessed anymore by python31.dll.
> You may fill the freed pyObjectHead with invalid values,
> in my case it's:  ob_refcnt= 7851148, ob_type = $80808080
>
> But later, when I call Py_Finalize,
> there inside is some access to the same freed memory;
> this causes an AV, more precisely,
> when the value $80808080 is checked.
>
> My Delphi-Debugger shows the following byte-sequence inside python31.dll:
> 5EC3568B7424088B4604F74054004000007504
>
> 5E                  - pop esi
> C3                  - ret
> 56                  - push esi
> 8B742408            - mov esi, [esp+$08]
> 8B4604              - mov eax, [esi+$04]
>        // eax = $80808080 //
>
> F7405400400000      - test [eax+$54], $00004000
>        // AV exception by read of address $808080D4 //
>
> 7504                - jnz $1e03681b
>
>
> Maybe this can help someone, thank you!

I'm sorry but this kind of issue is difficult to investigate without
the source code.
Normally I would compile everything (python & your program) in debug mode,
and try to see why the object is used after tp_dealloc.

For example, it's possible that your code does not handle reference
counts correctly. A call to Py_INCREF() may be missing somewhere; this
is a common error. tp_dealloc() is called when the reference count falls
to zero, but if the object is still referenced elsewhere, memory will be
accessed again!

Without further information, I cannot consider this as a problem in Python.
I know other extension modules that manage memory in their own way, and work.
It's more probably an issue in the code of your type.

-- 
Amaury Forgeot d'Arc

From smarv at gmx.net  Tue Jun  1 17:42:07 2010
From: smarv at gmx.net (smarv at gmx.net)
Date: Tue, 01 Jun 2010 17:42:07 +0200
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <AANLkTinAGxkYoif3vjBaUgAYuiBfrahHS7jGPF6N-X3-@mail.gmail.com>
References: <20100601074112.199330@gmx.net>
	<AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>
	<20100601122157.225330@gmx.net>
	<AANLkTinAGxkYoif3vjBaUgAYuiBfrahHS7jGPF6N-X3-@mail.gmail.com>
Message-ID: <20100601154207.178500@gmx.net>

> Without further information, I cannot consider this as a problem in
> Python.
> I know other extension modules that manage memory in their own way, and
> work.
> It's more probably an issue in the code of your type.
> 
> -- 
> Amaury Forgeot d'Arc

Ok, thank you, but I'm still hoping, someone could test this. 
I'm very sure, my app is not the cause; 
only the python31.dll (py_finalize) is accessing the freed memory. 
Inside py_finalize there is really no call to my hosting app (or reverse), 
I even tested this in my debugger.

In most applications this python-problem remains hidden, 
because their freeMem() leaves the freed memory unmodified. 
(And that's why very good debuggers modify the freed 
memory to reveal such hidden errors). 
You could simply test this by setting pyObject.ob_type = $80808080 
after freeMem( pyObject). Then later, call py_finalize, 
and you will see the same problem (Access violation by trying 
to use ob_type)

From amauryfa at gmail.com  Tue Jun  1 18:56:39 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 1 Jun 2010 18:56:39 +0200
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <20100601154207.178500@gmx.net>
References: <20100601074112.199330@gmx.net>
	<AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>
	<20100601122157.225330@gmx.net>
	<AANLkTinAGxkYoif3vjBaUgAYuiBfrahHS7jGPF6N-X3-@mail.gmail.com>
	<20100601154207.178500@gmx.net>
Message-ID: <AANLkTimUYfviHSoWHn7QcVplvPT4wzovyIvSZneqvnXi@mail.gmail.com>

2010/6/1  <smarv at gmx.net>:
>> Without further information, I cannot consider this as a problem in
>> Python.
>> I know other extension modules that manage memory in their own way, and
>> work.
>> It's more probably an issue in the code of your type.
>>
>> --
>> Amaury Forgeot d'Arc
>
> Ok, thank you, but I'm still hoping, someone could test this.
> I'm very sure, my app is not the cause;
> only the python31.dll (py_finalize) is accessing the freed memory.
> Inside py_finalize there is really no call to my hosting app (or reverse),
> I even tested this in my debugger.

To be clear:
- you did not provide anything for us to test.
- the fact that the crash is inside python31.dll does not indicate a
  bug in python.
Consider this (bogus) code:
     FILE *fp = fopen("c:/temp/t", "w");
     free(fp);   /* wrong: a FILE* must be released with fclose() */
This will lead to a crash at program exit (when fcloseall() is called
by the system), but the issue is really in the code: it should not free(fp).

Without knowing what your code really does, we won't be able to help.

-- 
Amaury Forgeot d'Arc

From ncoghlan at gmail.com  Wed Jun  2 14:33:19 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 02 Jun 2010 22:33:19 +1000
Subject: [Python-Dev] tp_dealloc
In-Reply-To: <20100601122157.225330@gmx.net>
References: <20100601074112.199330@gmx.net>	<AANLkTikXefWKO5mZ_2g68cPI5jYmFAnVZNuCTMpZgvfS@mail.gmail.com>
	<20100601122157.225330@gmx.net>
Message-ID: <4C064F8F.6000508@gmail.com>

On 01/06/10 22:21, smarv at gmx.net wrote:
>> This said, there may be a bug somewhere, but what do you want us to look
>> at?
>> Do you have a case that we could reproduce and investigate?
>>
>> --
>> Amaury Forgeot d'Arc
>
> Thank you, I'm not a C-Developer,
> but still I have one more detail:
>
> I call py_decRef( pyObj) of dll (version 3.1.1),
> ( which calls tp_dealloc, which calls my freeMem() method))
> No problem is reported here.

As Amaury has pointed out, there are a number of ways this could be a bug
in your extension module, or some other CPython extension you are using
(most obviously, a Py_DECREF without a corresponding Py_INCREF, but
there are probably other more exotic ways to manage it).

If you corrupt the reference count for a module global variable with an 
extra Py_DECREF call, then you may get an access violation at 
interpreter shutdown (i.e. in response to a Py_Finalize call) as the 
destruction of the module attempts to decrement the reference count of 
an object that was incorrectly deleted while it was still referenced.
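
The failure mode described above can be reproduced with a toy reference
counting model in Python. This is a sketch only: ToyHeap, finalize and the
object name "obj" are invented for illustration and do not correspond to
CPython internals.

```python
class ToyHeap:
    """Toy reference-counting model; names and behaviour are illustrative only."""

    def __init__(self):
        self.refcnt = {}

    def new(self, name):
        self.refcnt[name] = 1          # the creator owns one reference

    def decref(self, name):
        if name not in self.refcnt:
            # Stands in for the access violation on freed memory.
            raise RuntimeError("access to freed object %r" % name)
        self.refcnt[name] -= 1
        if self.refcnt[name] == 0:
            del self.refcnt[name]      # "tp_dealloc": the memory is gone


def finalize(heap, module_globals):
    # Module teardown drops the reference each global is supposed to own.
    for name in module_globals:
        heap.decref(name)


heap = ToyHeap()
heap.new("obj")                # the only reference is owned by a module global
module_globals = ["obj"]
heap.decref("obj")             # extra, unbalanced decref: frees silently

try:
    finalize(heap, module_globals)
    crashed = False
except RuntimeError:
    crashed = True             # the error only shows up at shutdown
```

The unbalanced decref itself reports no problem; the failure appears only
later, when teardown touches the object that was already freed.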

Since the symptoms you have described so far *exactly* match the 
expected symptoms of a reference counting bug which may not have 
anything whatsoever to do with the interpreter core or the standard 
library, you're going to need a much better defined test case (written 
in C or Python) to convince us that our code is the problem.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

From flashk at gmail.com  Wed Jun  2 20:32:55 2010
From: flashk at gmail.com (Farshid Lashkari)
Date: Wed, 2 Jun 2010 11:32:55 -0700
Subject: [Python-Dev] Windows registry path not ignored with
	Py_IgnoreEnvironmentFlag set
Message-ID: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com>

Hello,

I noticed that if Py_IgnoreEnvironmentFlag is enabled, the Windows registry
is still used to initialize sys.path during startup. Is this an oversight or
intentional?

I assumed one of the intentions of this flag is to prevent embedded Python
interpreters from being affected by other Python installations. Ignoring the
Windows registry as well as environment variables seems to make sense in this
situation.
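
One way to observe the behaviour from the outside is to start a fresh
interpreter with -E (the command line counterpart of
Py_IgnoreEnvironmentFlag) and print what it ended up with. The helper name
startup_state is made up, and the sketch is written for a current Python 3
rather than 2.7.

```python
import ast
import subprocess
import sys


def startup_state(*interpreter_args):
    """Return (ignore_environment flag, sys.path) of a fresh interpreter."""
    code = "import sys; print(sys.flags.ignore_environment); print(repr(sys.path))"
    result = subprocess.run(
        [sys.executable, *interpreter_args, "-c", code],
        capture_output=True, text=True, check=True,
    )
    flag_line, path_line = result.stdout.splitlines()
    return int(flag_line), ast.literal_eval(path_line)
```

Comparing startup_state("-E")[1] with startup_state()[1] on a Windows
machine with several Python installations would be one way to check which
entries still come from somewhere other than the environment.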

If this is an oversight, would it be too late to have this fixed in Python
2.7?

Cheers,
Farshid

From status at bugs.python.org  Fri Jun  4 18:08:50 2010
From: status at bugs.python.org (Python tracker)
Date: Fri,  4 Jun 2010 18:08:50 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20100604160850.026847813C@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2010-05-28 - 2010-06-04)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2727 open (+38) / 17988 closed (+15) / 20715 total (+53)

Open issues with patches:  1103

Average duration of open issues: 719 days.
Median duration of open issues: 504 days.

Open Issues Breakdown
       open  2706 (+38)
languishing    12 ( +0)
    pending     8 ( +0)

Issues Created Or Reopened (58)
_______________________________

Seconds range in time unit                                     2010-06-03
       http://bugs.python.org/issue2568    reopened belopolsky                           
       patch, easy                                                             

26.rc1: test_signal issue on FreeBSD 6.3                       2010-06-03
CLOSED http://bugs.python.org/issue3864    reopened skrah                                
       patch, easy, buildbot                                                   

urllib2 basicauth broken in 2.6.5: RuntimeError: maximum recur 2010-06-04
       http://bugs.python.org/issue8797    reopened orsenthil                            
                                                                               

TZ offset description is unclear in docs                       2010-06-04
       http://bugs.python.org/issue8810    reopened belopolsky                           
       easy, needs review                                                      

truncate() semantics changed in 3.1.2                          2010-05-28
       http://bugs.python.org/issue8840    reopened tjreedy                              
                                                                               

Condition.wait() doesn't raise KeyboardInterrupt               2010-05-28
       http://bugs.python.org/issue8844    created  hobb0001                             
                                                                               

Expose sqlite3 connection inTransaction as read-only in_transa 2010-05-28
CLOSED http://bugs.python.org/issue8845    created  r.david.murray                       
       patch, easy                                                             

cgi.py bug report + fix: tailing carriage return and newline c 2010-05-28
       http://bugs.python.org/issue8846    created  wobsta                               
       patch                                                                   

crash appending list and namedtuple                            2010-05-28
       http://bugs.python.org/issue8847    created  benrg                                
                                                                               

Deprecate or remove "U" and "U#" formats of Py_BuildValue()    2010-05-29
       http://bugs.python.org/issue8848    created  haypo                                
       patch                                                                   

python.exe problem with cvxopt                                 2010-05-29
       http://bugs.python.org/issue8849    created  jroach                               
                                                                               

Remove "w" format of PyParse_ParseTuple()                      2010-05-29
       http://bugs.python.org/issue8850    created  haypo                                
                                                                               

pkgutil document needs more markups                            2010-05-29
       http://bugs.python.org/issue8851    created  mft                                  
       patch                                                                   

_socket fails to build on OpenSolaris x64                      2010-05-29
       http://bugs.python.org/issue8852    created  drkirkby                             
       patch                                                                   

getaddrinfo should accept port of type long                    2010-05-29
       http://bugs.python.org/issue8853    created  AndiDog                              
       patch                                                                   

msvc9compiler.py: find_vcvarsall() doesn't work with VS2008 on 2010-05-29
       http://bugs.python.org/issue8854    created  lemburg                              
       64bit                                                                   

Shelve documentation lacks security warning                    2010-05-30
       http://bugs.python.org/issue8855    created  Longpoke                             
                                                                               

Error in ceval.c when building --without-threads               2010-05-30
CLOSED http://bugs.python.org/issue8856    created  merwok                               
                                                                               

socket.getaddrinfo needs tests                                 2010-05-30
       http://bugs.python.org/issue8857    created  pitrou                               
       patch                                                                   

socket.getaddrinfo returns wrong results for IPv6 addresses    2010-05-30
CLOSED http://bugs.python.org/issue8858    created  pitrou                               
                                                                               

split() splits on non whitespace char when ther is no separato 2010-05-30
CLOSED http://bugs.python.org/issue8859    created  PeterL                               
                                                                               

Rounding in timedelta constructor is inconsistent with that in 2010-05-31
       http://bugs.python.org/issue8860    created  belopolsky                           
       patch                                                                   

curses.wrapper : unnessesary code                              2010-05-31
       http://bugs.python.org/issue8861    created  july                                 
       patch                                                                   

curses.wrapper does not restore terminal if curses.getkey() ge 2010-05-31
       http://bugs.python.org/issue8862    created  july                                 
       patch                                                                   

Segfault handler: display Python backtrace on segfault         2010-05-31
       http://bugs.python.org/issue8863    created  haypo                                
       patch                                                                   

multiprocessing: undefined struct/union member: msg_control    2010-05-31
       http://bugs.python.org/issue8864    created  srid                                 
                                                                               

select.poll is not thread safe                                 2010-05-31
       http://bugs.python.org/issue8865    created  apexo                                
                                                                               

socket.getaddrinfo() should support keyword arguments          2010-05-31
       http://bugs.python.org/issue8866    created  giampaolo.rodola                     
       patch                                                                   

serve.py (using wsgiref) cannot serve Python docs under Python 2010-05-31
       http://bugs.python.org/issue8867    created  r.david.murray                       
                                                                               

Framework install does not behave as a framework               2010-06-01
CLOSED http://bugs.python.org/issue8868    created  mdehoon                              
                                                                               

execfile does not work with UNC paths                          2010-06-01
       http://bugs.python.org/issue8869    created  stier08                              
                                                                               

--user-access-control=force produces invalid installer on Vist 2010-06-01
CLOSED http://bugs.python.org/issue8870    created  techtonik                            
                                                                               

--user-access-control=auto has no effect                       2010-06-01
       http://bugs.python.org/issue8871    created  techtonik                            
                                                                               

if/else stament bug?                                           2010-06-01
CLOSED http://bugs.python.org/issue8872    created  chrits55                             
                                                                               

Popen uses 333 times as much CPU as a shell pipe on Mac OS X   2010-06-01
       http://bugs.python.org/issue8873    created  hughsw                               
                                                                               

py3k documentation mentions deprecated opcode LOAD_LOCALS      2010-06-01
CLOSED http://bugs.python.org/issue8874    created  Yaniv.Aknin                          
                                                                               

XML-RPC improvement is described twice.                        2010-06-02
       http://bugs.python.org/issue8875    created  naoki                                
                                                                               

distutils should not assume that hardlinks will work           2010-06-02
       http://bugs.python.org/issue8876    created  samtygier                            
       patch                                                                   

2to3 fixes stdlib import wrongly                               2010-06-02
CLOSED http://bugs.python.org/issue8877    created  djc                                  
                                                                               

IDLE - str(integer) - TypeError: 'str' object is not callable  2010-06-02
CLOSED http://bugs.python.org/issue8878    created  Stranger381                          
                                                                               

Implement os.link on Windows                                   2010-06-02
       http://bugs.python.org/issue8879    created  brian.curtin                         
                                                                               

ConfigParser.set does not convert non-string values            2010-06-02
CLOSED http://bugs.python.org/issue8880    created  Edwin.Pozharski                      
                                                                               

socket.getaddrinfo() should return named tuples                2010-06-02
       http://bugs.python.org/issue8881    created  giampaolo.rodola                     
                                                                               

socketmodule.c`getsockaddrarg() should not check the	length of 2010-06-03
       http://bugs.python.org/issue8882    created  Edward.Pilatowicz                    
                                                                               

Proxy exception lookup fails on MacOS in urllib.               2010-06-03
       http://bugs.python.org/issue8883    created  yorik.sar                            
       patch                                                                   

Allow binding to local address in http.client                  2010-06-03
CLOSED http://bugs.python.org/issue8884    created  Gaz.Davidson                         
                                                                               

markerbase declaration errors aren't recoverable               2010-06-03
       http://bugs.python.org/issue8885    created  mnot                                 
                                                                               

zipfile.ZipExtFile is a context manager, but that is not docum 2010-06-03
       http://bugs.python.org/issue8886    created  sandberg                             
       patch                                                                   

"pydoc str" works but not "pydoc str.translate"                2010-06-03
       http://bugs.python.org/issue8887    created  merwok                               
                                                                               

Promote SafeConfigParser and warn about ConfigParser           2010-06-03
       http://bugs.python.org/issue8888    created  merwok                               
                                                                               

test_support.transient_internet fails on Freebsd because socke 2010-06-03
       http://bugs.python.org/issue8889    created  r.david.murray                       
       patch                                                                   

Modules have dangerous examples in documentation               2010-06-04
       http://bugs.python.org/issue8890    reopened Henri.Salo                           
                                                                               

sort files before archiving for consistency                    2010-06-03
       http://bugs.python.org/issue8891    created  techtonik                            
       patch                                                                   

2to3 fails with assertion failure on "from itertools import *" 2010-06-03
       http://bugs.python.org/issue8892    created  dmalcolm                             
       patch                                                                   

file.{read,readlines} behaviour on Solaris                     2010-06-03
       http://bugs.python.org/issue8893    created  kalt                                 
       patch, needs review                                                     

urllib2 authentication manager retries forever if password is  2010-06-04
CLOSED http://bugs.python.org/issue8894    created  Jurjen                               
                                                                               

newline vs. newlines in io module                              2010-06-04
CLOSED http://bugs.python.org/issue8895    created  jmfauth                              
                                                                               

email.encoders.encode_base64 sets payload to bytes, should set 2010-06-04
CLOSED http://bugs.python.org/issue8896    created  forest_atq                           
       patch                                                                   


Issues Now Closed (39)
______________________

distutils sdist add_defaults does not add data_files            813 days
       http://bugs.python.org/issue2279    merwok                               
                                                                               

Vista UAC/elevation support for bdist_wininst                   785 days
       http://bugs.python.org/issue2581    techtonik                            
       patch, patch                                                            

26.rc1: test_signal issue on FreeBSD 6.3                          0 days
       http://bugs.python.org/issue3864    skrah                                
       patch, easy, buildbot                                                   

Real segmentation fault handler                                 609 days
       http://bugs.python.org/issue3999    haypo                                
       patch                                                                   

Fix complex type to avoid coercion in 2.7.                      474 days
       http://bugs.python.org/issue5211    mark.dickinson                       
       easy                                                                    

datetime.monthdelta                                             453 days
       http://bugs.python.org/issue5434    belopolsky                           
       patch                                                                   

Contradictory documentation for email.mime.text.MIMEText        317 days
       http://bugs.python.org/issue6521    r.david.murray                       
       patch                                                                   

shadows around the io truncate() semantics                      253 days
       http://bugs.python.org/issue6939    ncoghlan                             
       patch                                                                   

Improve explanation of tab expansion in doctests                155 days
       http://bugs.python.org/issue7583    r.david.murray                       
       patch                                                                   

Too narrow platform check in test_datetime                      113 days
       http://bugs.python.org/issue7879    belopolsky                           
       patch, 26backport                                                       

Improve test_os._kill (failing on slow machines)                 44 days
       http://bugs.python.org/issue8405    haypo                                
       patch                                                                   

Test assumptions for test_itimer_virtual and test_itimer_prof    48 days
       http://bugs.python.org/issue8424    skrah                                
       patch, buildbot                                                         

Changes to content of Demo/turtle                                24 days
       http://bugs.python.org/issue8616    georg.brandl                         
       patch                                                                   

test_winsound fails when no playback devices configured          28 days
       http://bugs.python.org/issue8618    brian.curtin                         
       patch                                                                   

2.7 regression in tarfile: IOError: link could not be created    17 days
       http://bugs.python.org/issue8741    lars.gustaebel                       
                                                                               

integer-to-complex comparisons give incorrect results            12 days
       http://bugs.python.org/issue8748    minge                                
       patch                                                                   

urllib.urlencode documentation unclear on doseq                  10 days
       http://bugs.python.org/issue8788    orsenthil                            
                                                                               

IDLE editior not opening                                          3 days
       http://bugs.python.org/issue8829    orsenthil                            
                                                                               

tarfile:  broken hardlink handling and testcase.                  7 days
       http://bugs.python.org/issue8833    lars.gustaebel                       
       patch                                                                   

PyArg_ParseTuple(): remove old and unused "O?" format             1 days
       http://bugs.python.org/issue8837    haypo                                
       patch                                                                   

Expose sqlite3 connection inTransaction as read-only in_transa    3 days
       http://bugs.python.org/issue8845    r.david.murray                       
       patch, easy                                                             

Error in ceval.c when building --without-threads                  0 days
       http://bugs.python.org/issue8856    benjamin.peterson                    
                                                                               

socket.getaddrinfo returns wrong results for IPv6 addresses       1 days
       http://bugs.python.org/issue8858    pitrou                               
                                                                               

split() splits on non whitespace char when ther is no separato    1 days
       http://bugs.python.org/issue8859    PeterL                               
                                                                               

Framework install does not behave as a framework                  1 days
       http://bugs.python.org/issue8868    ronaldoussoren                       
                                                                               

--user-access-control=force produces invalid installer on Vist    1 days
       http://bugs.python.org/issue8870    techtonik                            
                                                                               

if/else stament bug?                                              0 days
       http://bugs.python.org/issue8872    r.david.murray                       
                                                                               

py3k documentation mentions deprecated opcode LOAD_LOCALS         1 days
       http://bugs.python.org/issue8874    benjamin.peterson                    
                                                                               

2to3 fixes stdlib import wrongly                                  0 days
       http://bugs.python.org/issue8877    benjamin.peterson                    
                                                                               

IDLE - str(integer) - TypeError: 'str' object is not callable     0 days
       http://bugs.python.org/issue8878    mark.dickinson                       
                                                                               

ConfigParser.set does not convert non-string values               1 days
       http://bugs.python.org/issue8880    Edwin.Pozharski                      
                                                                               

Allow binding to local address in http.client                     0 days
       http://bugs.python.org/issue8884    loewis                               
                                                                               

urllib2 authentication manager retries forever if password is     0 days
       http://bugs.python.org/issue8894    orsenthil                            
                                                                               

newline vs. newlines in io module                                 0 days
       http://bugs.python.org/issue8895    merwok                               
                                                                               

email.encoders.encode_base64 sets payload to bytes, should set    0 days
       http://bugs.python.org/issue8896    forest_atq                           
       patch                                                                   

timedelta multiply and divide by floating point                1722 days
       http://bugs.python.org/issue1289118 belopolsky                           
       patch                                                                   

unicode in email.MIMEText and email/Charset.py                 1647 days
       http://bugs.python.org/issue1368247 r.david.murray                       
       patch                                                                   

email package and Unicode strings handling                     1360 days
       http://bugs.python.org/issue1555842 r.david.murray                       
                                                                               

improve xrange.__contains__                                    1036 days
       http://bugs.python.org/issue1766304 benjamin.peterson                    
       patch                                                                   


Top Issues Most Discussed (10)
______________________________

 18 multipart/form-data encoding                                     704 days
open        http://bugs.python.org/issue3244   

 16 sort files before archiving for consistency                        1 days
open        http://bugs.python.org/issue8891   

 16 datetime lacks concrete tzinfo impl. for UTC                     492 days
open        http://bugs.python.org/issue5094   

 12 improve xrange.__contains__                                     1036 days
closed      http://bugs.python.org/issue1766304

 10 Modules have dangerous examples in documentation                   0 days
open        http://bugs.python.org/issue8890   

 10 multiprocessing: undefined struct/union member: msg_control        4 days
open        http://bugs.python.org/issue8864   

  9 TZ offset description is unclear in docs                           1 days
open        http://bugs.python.org/issue8810   

  8 Rounding in timedelta constructor is inconsistent with that in     4 days
open        http://bugs.python.org/issue8860   

  8 Expose sqlite3 connection inTransaction as read-only in_transac    3 days
closed      http://bugs.python.org/issue8845   

  7 --user-access-control=force produces invalid installer on Vista    1 days
closed      http://bugs.python.org/issue8870   


From skippy.hammond at gmail.com  Sat Jun  5 01:47:26 2010
From: skippy.hammond at gmail.com (Mark Hammond)
Date: Fri, 04 Jun 2010 16:47:26 -0700
Subject: [Python-Dev] Windows registry path not ignored with
 Py_IgnoreEnvironmentFlag set
In-Reply-To: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com>
References: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com>
Message-ID: <4C09908E.3080706@gmail.com>

On 2/06/2010 11:32 AM, Farshid Lashkari wrote:
> Hello,
>
> I noticed that if Py_IgnoreEnvironmentFlag is enabled, the Windows
> registry is still used to initialize sys.path during startup. Is this an
> oversight or intentional?

I guess it falls somewhere in the middle - the flag refers to the 
'environment' so I believe it hasn't really been considered as applying 
to the registry - IOW, the reference to 'environment' probably refers to 
the specific 'environment variables' rather than the more general 
'execution environment'.
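
As a rough illustration of the "environment variables" reading of the flag: the -E command-line option sets Py_IgnoreEnvironmentFlag, so a child interpreter started with -E should ignore PYTHONPATH. The sketch below assumes Python 3.7 or later (for subprocess.run with capture_output); the helper name child_sys_path and the temporary marker directory are made up for this example. It says nothing about the Windows registry, which is the open question in this thread.

```python
import os
import subprocess
import sys
import tempfile


def child_sys_path(extra_args, env):
    """Return sys.path as seen by a freshly started child interpreter."""
    code = "import sys; print('\\n'.join(sys.path))"
    cmd = [sys.executable, *extra_args, "-c", code]
    result = subprocess.run(cmd, env=env, capture_output=True,
                            text=True, check=True)
    return result.stdout.splitlines()


# Hypothetical marker directory, injected through PYTHONPATH only.
with tempfile.TemporaryDirectory(prefix="pythonpath_marker_") as marker_dir:
    env = dict(os.environ, PYTHONPATH=marker_dir)
    normal_path = child_sys_path([], env)        # environment honoured
    ignoring_path = child_sys_path(["-E"], env)  # Py_IgnoreEnvironmentFlag set

marker = os.path.basename(marker_dir)
seen_normally = any(marker in entry for entry in normal_path)
seen_with_E = any(marker in entry for entry in ignoring_path)

print("PYTHONPATH entry used without -E:", seen_normally)
print("PYTHONPATH entry used with -E:   ", seen_with_E)
```

Running the same comparison on Windows with a registry-provided path entry would be one way to reproduce the behaviour reported above.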

> I assumed one of the intentions of this flag is to prevent embedded
> Python interpreters from being affected by other Python installations.
> Ignoring the Window registry as well as environment variables seems to
> make sense in this situation.

I agree.

> If this is an oversight, would it be too late to have this fixed in
> Python 2.7?

Others will have opinions which carry more weight than mine, but I see 
no reason it should not be fixed for *some* Python version.  Assuming no 
objections from anyone else, I suggest the best way to get this to 
happen in the short to medium term would be to open a bug with a patch. 
  A bug without a patch would also be worthwhile but would almost 
certainly cause it to be pushed back to a future 3.x version...

Cheers,

Mark

From kristjan at ccpgames.com  Sat Jun  5 10:34:19 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Sat, 5 Jun 2010 08:34:19 +0000
Subject: [Python-Dev] ssl
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>

Hello there.
I wanted to do some work on the ssl module, but I was a bit daunted by the prerequisites.  Is there anywhere that I can get precompiled libs for the OpenSSL that we use?
In general, getting all those "external" projects built seems to be complex.  Is there a fast way?

What I want to do, is to implement a separate BIO for OpenSSL, one that calls back into python for writes and reads.  This is so that I can use my own sockets implementation for the actual IO, in particular, I want to funnel the encrypted data through our IOCompletion-based stackless sockets.

If successful, I think this would be a useful addition to ssl.
You would do something like:

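class BIO:
    def write(self, data): pass  # would send encrypted bytes
    def read(self, n): pass      # would return up to n encrypted bytes

import ssl
bio = BIO()
ssl_socket = ssl.wrap_bio(bio, ca_certs=...)  # proposed API, not an existing function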


I am new to OpenSSL, I haven't even looked at what a BIO looks like, but I read this:  http://marc.info/?l=openssl-users&m=99909952822335&w=2
which indicates that this ought to be possible.  And before I start experimenting, I need to get my OpenSSL external ready.

Any thoughts?

Kristján

From exarkun at twistedmatrix.com  Sat Jun  5 15:11:09 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Sat, 05 Jun 2010 13:11:09 -0000
Subject: [Python-Dev] ssl
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>
Message-ID: <20100605131109.1708.564335160.divmod.xquotient.15@localhost.localdomain>

On 08:34 am, kristjan at ccpgames.com wrote:
>Hello there.
>I wanted to do some work on the ssl module, but I was a bit daunted at 
>the prerequisites.  Is there anywhere that I can get at precompiled 
>libs for the openssl that we use?
>In general, getting all those "external" projects seem to be complex to 
>build.  Is there a fast way?

I take it the challenge is that you want to do development on Windows? 
If so, this might help:

  http://www.slproweb.com/products/Win32OpenSSL.html

It's what I use for any Windows pyOpenSSL development I need to do.
>
>What I want to do, is to implement a separate BIO for OpenSSL, one that 
>calls back into python for writes and reads.  This is so that I can use 
>my own sockets implementation for the actual IO, in particular, I want 
>to funnel the encrypted data through our IOCompletion-based stackless 
>sockets.

For what it's worth, Twisted's IOCP SSL support is implemented using 
pyOpenSSL's support of OpenSSL memory BIOs.  This is a little different 
from your idea: memory BIOs are a built-in part of OpenSSL, and just 
give you a buffer from which you can pull whatever bytes OpenSSL wanted 
to write (or a buffer into which to put bytes for OpenSSL to read).

I suspect this would work well enough for your use case.  Being able to 
implement an actual BIO in Python would be pretty cool, though.
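
For readers on a later Python: the memory-BIO pattern described above can be sketched with the standard ssl module, assuming Python 3.5 or newer, where ssl.MemoryBIO and SSLContext.wrap_bio are available (neither existed in the ssl module when this thread was written). This is a minimal sketch, not pyOpenSSL's or Twisted's code; the host name "example.org" is a placeholder and no network connection is made.

```python
import ssl

# Two in-memory buffers: bytes received from the peer are written into
# `incoming`; bytes OpenSSL wants to send are read out of `outgoing`.
incoming = ssl.MemoryBIO()
outgoing = ssl.MemoryBIO()

ctx = ssl.create_default_context()
# "example.org" is a placeholder host name; nothing is connected here.
tls = ctx.wrap_bio(incoming, outgoing, server_hostname="example.org")

try:
    tls.do_handshake()
except ssl.SSLWantReadError:
    # Expected at this point: OpenSSL has queued its first handshake
    # message and now needs the peer's reply, which the application
    # would supply with incoming.write(data_from_transport).
    pass

# The application is free to ship these bytes over any transport it
# likes (an IOCP socket, a pipe, ...).
first_flight = outgoing.read()
print(len(first_flight), "bytes ready to send")
```

The application owns the transport loop: it drains `outgoing` after each TLS operation and feeds whatever arrives into `incoming`, so no OpenSSL callback into Python code is needed.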
>
>If successful, I think this would be a useful addition to ssl.
>You would do something like:
>
>class BIO():
>  def write(): pass
>  def read(): pass
>
>from ssl.import
>bio = BIO()
>ssl_socket = ssl.wrap_bio(bio, ca_certs=...)

Hopefully this would integrate more nicely with the recent work Antoine 
has done with SSL contexts.  The preferred API for creating an SSL 
connection is now more like this:

    import ssl
    ctx = ssl.SSLContext(...)
    conn = ctx.wrap_socket(...)

So perhaps you want to add a wrap_bio method to SSLContext.  In fact, 
this would be the more general API, and could supersede wrap_socket: 
after all, socket support is just implemented with the socket BIOs. 
wrap_socket would become a simple wrapper around something like 
wrap_bio(SocketBIO(socket)).
>
>I am new to OpenSSL, I haven't even looked at what a BIO looks like, 
>but I read this:  http://marc.info/?l=openssl-users&m=99909952822335&w=2
>which indicates that this ought to be possible.  And before I start 
>experimenting, I need to get my OpenSSL external ready.
>
>Any thoughts?

It should be possible.  One thing that's pretty tricky is getting 
threading right, though.  Python doesn't have to deal with this problem 
yet, as far as I know, because it never does something that causes 
OpenSSL to call back into Python code.  Once you have a Python BIO 
implementation, this will clearly be necessary, and you'll have to solve 
this.  It's certainly possible, but quite fiddly.

Jean-Paul

From guido at python.org  Sat Jun  5 16:55:05 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 5 Jun 2010 07:55:05 -0700
Subject: [Python-Dev] Windows registry path not ignored with
	Py_IgnoreEnvironmentFlag set
In-Reply-To: <4C09908E.3080706@gmail.com>
References: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com> 
	<4C09908E.3080706@gmail.com>
Message-ID: <AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>

On Fri, Jun 4, 2010 at 4:47 PM, Mark Hammond <skippy.hammond at gmail.com> wrote:
> On 2/06/2010 11:32 AM, Farshid Lashkari wrote:
>>
>> Hello,
>>
>> I noticed that if Py_IgnoreEnvironmentFlag is enabled, the Windows
>> registry is still used to initialize sys.path during startup. Is this an
>> oversight or intentional?
>
> I guess it falls somewhere in the middle - the flag refers to the
> 'environment' so I believe it hasn't really been considered as applying to
> the registry - IOW, the reference to 'environment' probably refers to the
> specific 'environment variables' rather than the more general 'execution
> environment'.
>
>> I assumed one of the intentions of this flag is to prevent embedded
>> Python interpreters from being affected by other Python installations.
>> Ignoring the Window registry as well as environment variables seems to
>> make sense in this situation.
>
> I agree.
>
>> If this is an oversight, would it be too late to have this fixed in
>> Python 2.7?
>
> Others will have opinions which carry more weight than mine, but I see no
> reason it should not be fixed for *some* Python version.  Assuming no
> objections from anyone else, I suggest the best way to get this to happen in
> the short to medium term would be to open a bug with a patch.  A bug without
> a patch would also be worthwhile but would almost certainly cause it to be
> pushed back to a future 3.x version...

I don't object (this had never occurred to me), but is Python on
Windows fully functioning when the registry is entirely ignored?

-- 
--Guido van Rossum (python.org/~guido)

From flashk at gmail.com  Sat Jun  5 20:03:25 2010
From: flashk at gmail.com (Farshid Lashkari)
Date: Sat, 5 Jun 2010 11:03:25 -0700
Subject: [Python-Dev] Windows registry path not ignored with
	Py_IgnoreEnvironmentFlag set
In-Reply-To: <AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
References: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com> 
	<4C09908E.3080706@gmail.com>
	<AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
Message-ID: <AANLkTilA3VgkaoLHky0t1lqx4cRMO0MokPnbmVqH1G0B@mail.gmail.com>

On Sat, Jun 5, 2010 at 7:55 AM, Guido van Rossum <guido at python.org> wrote:
>
> I don't object (this had never occurred to me), but is Python on
> Windows fully functioning when the registry is entirely ignored?


I believe so. The path of executable and Python DLL are used to initialize
sys.path, which should be enough to find the necessary files.

From kristjan at ccpgames.com  Sat Jun  5 20:05:07 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Sat, 5 Jun 2010 18:05:07 +0000
Subject: [Python-Dev] Windows registry path not ignored
	with	Py_IgnoreEnvironmentFlag set
In-Reply-To: <AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
References: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com>
	<4C09908E.3080706@gmail.com>
	<AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D533E5D@exchis.ccp.ad.local>

Tangentially relevant is the following: when embedding Python, it is currently impossible (well, in 2.x anyway) to completely override Python's magic path-guessing algorithm.  This is annoying.  At the last PyCon, the talk on embedding Python showed how applications that do that often get started through bootstrapping batch scripts that set up the environment for Python, to guide the path-setting algorithm along.

At CCP, we have patched python so that we can specify an initial sys.path, and completely disable the path guessing algorithm.  This is necessary because python is _embedded_ and it is the embedding application that knows where it is allowed to look for libraries.  This is in addition to telling it to ignore the environment.

In fact, it is my opinion that the path init stuff, as well as command line parsing and so on, really belongs in python.exe and not in python25.lib, although one can argue for the convenience of keeping it in the .lib.  But IMHO, it should not be part of Py_Initialize.

Perhaps I'll submit this particular patch to the tracker one day.

K

> -----Original Message-----
> From: python-dev-bounces+kristjan=ccpgames.com at python.org
> [mailto:python-dev-bounces+kristjan=ccpgames.com at python.org] On Behalf
> Of Guido van Rossum
> Sent: 5. júní 2010 14:55
> To: Mark Hammond
> Cc: Python-Dev
> Subject: Re: [Python-Dev] Windows registry path not ignored with
> Py_IgnoreEnvironmentFlag set
> 
> I don't object (this had never occurred to me), but is Python on
> Windows fully functioning when the registry is entirely ignored?
> 
> --
> --Guido van Rossum (python.org/~guido)


From fuzzyman at voidspace.org.uk  Sat Jun  5 20:32:39 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 05 Jun 2010 19:32:39 +0100
Subject: [Python-Dev] Windows registry path not ignored
 with	Py_IgnoreEnvironmentFlag set
In-Reply-To: <AANLkTilA3VgkaoLHky0t1lqx4cRMO0MokPnbmVqH1G0B@mail.gmail.com>
References: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com>
	<4C09908E.3080706@gmail.com>	<AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
	<AANLkTilA3VgkaoLHky0t1lqx4cRMO0MokPnbmVqH1G0B@mail.gmail.com>
Message-ID: <4C0A9847.6000503@voidspace.org.uk>

On 05/06/2010 19:03, Farshid Lashkari wrote:
>
> On Sat, Jun 5, 2010 at 7:55 AM, Guido van Rossum <guido at python.org 
> <mailto:guido at python.org>> wrote:
>
>     I don't object (this had never occurred to me), but is Python on
>     Windows fully functioning when the registry is entirely ignored?
>
>

Yes, it works fine. This is one of the things py2exe does to create 
'standalone' Python programs for Windows.

Michael

> I believe so. The path of executable and Python DLL are used to 
> initialize sys.path, which should be enough to find the necessary files.
>
>


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.



From tjreedy at udel.edu  Sat Jun  5 20:51:38 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 05 Jun 2010 14:51:38 -0400
Subject: [Python-Dev] Windows registry path not ignored with
 Py_IgnoreEnvironmentFlag set
In-Reply-To: <AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
References: <AANLkTinhuABXlYDBDHAmD0ocQ9Zn12acZtg3hnp0aF4y@mail.gmail.com>
	<4C09908E.3080706@gmail.com>
	<AANLkTikDOen-4ewZCu9AKHYahQTT5t7ojhUQaRlyoFNv@mail.gmail.com>
Message-ID: <hue6bs$pb4$1@dough.gmane.org>

On 6/5/2010 10:55 AM, Guido van Rossum wrote:

> I don't object (this had never occurred to me), but is Python on
> Windows fully functioning when the registry is entirely ignored?

There have been a couple of portable CPython-on-a-CD or memory-stick 
distributions that supposedly run on any machine without 'installation' 
(writing to the registry), so they must run without reading anything 
Python-specific.


From martin at v.loewis.de  Sun Jun  6 01:51:57 2010
From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 06 Jun 2010 01:51:57 +0200
Subject: [Python-Dev] ssl
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>
Message-ID: <4C0AE31D.7030504@v.loewis.de>

> In general, getting all those "external" projects seem to be complex to
> build.  Is there a fast way?

Run Tools\buildbot\external.bat.

Regards,
Martin

From benjamin at python.org  Sun Jun  6 04:08:32 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Sat, 5 Jun 2010 21:08:32 -0500
Subject: [Python-Dev] [RELEASE] Python 2.7 release candidate 1 released
Message-ID: <AANLkTikXq6QVgKM5FRyzgfmz0vRoviaRnLrIeZq6P9K1@mail.gmail.com>

On behalf of the Python development team, I'm effusive to announce the first
release candidate of Python 2.7.

Python 2.7 is scheduled (by Guido and Python-dev) to be the last major version
in the 2.x series. However, 2.7 will have an extended period of bugfix
maintenance.

2.7 includes many features that were first released in Python 3.1. The faster io
module, the new nested with statement syntax, improved float repr, set literals,
dictionary views, and the memoryview object have been backported from 3.1. Other
features include an ordered dictionary implementation, unittest improvements, a
new sysconfig module, and support for ttk Tile in Tkinter.  For a more extensive
list of changes in 2.7, see http://doc.python.org/dev/whatsnew/2.7.html or
Misc/NEWS in the Python distribution.

To download Python 2.7 visit:

     http://www.python.org/download/releases/2.7/

While this is a preview release and is thus not suitable for production use, we
strongly encourage Python application and library developers to test the release
with their code and report any bugs they encounter to:

     http://bugs.python.org/

This helps ensure that those upgrading to Python 2.7 will encounter as few bumps
as possible.

2.7 documentation can be found at:

     http://docs.python.org/2.7/


Enjoy!

--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 2.7's contributors)

From kristjan at ccpgames.com  Mon Jun  7 12:44:40 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Mon, 7 Jun 2010 10:44:40 +0000
Subject: [Python-Dev] ssl
In-Reply-To: <4C0AE31D.7030504@v.loewis.de>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>
	<4C0AE31D.7030504@v.loewis.de>
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D533F48@exchis.ccp.ad.local>

Thanks, Martin.
I did as you suggested, and by installing nasm (creating nasmw.exe as a copy of nasm.exe) and without installing perl, was able to build the 32 bit debug version.
The 64 bit version didn't want to build, probably because of some strangeness in the .vcprops files.
amd64.vcprops defines PythonExe to $(HOST_PYTHON) which isn't defined.
Removing this macro definition makes everything build, right up to the final link:
2>Linking...
2>   Creating library D:\pydev\python\trunk\PCbuild\\amd64\\_ssl_d.lib and object D:\pydev\python\trunk\PCbuild\\amd64\\_ssl_d.exp
2>Creating manifest...
2>.\x64-temp-Debug\_ssl\_ssl.exe.intermediate.manifest : general error c1010070: Failed to load and parse the manifest. El sistema no puede encontrar el archivo especificado.
2>Build log was saved at "file://D:\pydev\python\trunk\PCbuild\x64-temp-Debug\_ssl\BuildLog.htm"
2>_ssl - 1 error(s), 246 warning(s)

The above is using the "trunk", but I got the same result with branches/py3k.

Please don't tell me that I need to install Perl :)
K


> -----Original Message-----
> From: "Martin v. Löwis" [mailto:martin at v.loewis.de]
> Sent: 5. júní 2010 23:52
> To: Kristján Valur Jónsson
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] ssl
> 
> > In general, getting all those "external" projects seem to be complex
> to
> > build.  Is there a fast way?
> 
> Run Tools\buildbot\external.bat.
> 
> Regards,
> Martin


From martin at v.loewis.de  Mon Jun  7 22:33:36 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Mon, 07 Jun 2010 22:33:36 +0200
Subject: [Python-Dev] ssl
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D533F48@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D533E28@exchis.ccp.ad.local>
	<4C0AE31D.7030504@v.loewis.de>
	<930F189C8A437347B80DF2C156F7EC7F0A8D533F48@exchis.ccp.ad.local>
Message-ID: <4C0D57A0.3060309@v.loewis.de>

On 07.06.2010 12:44, Kristján Valur Jónsson wrote:
> Thanks martin.
> I did as you suggested, and by installing nasm (creating nasmw.exe as a copy of nasm.exe) and without installing perl, was able to build the 32 bit debug version.
> The 64 bit version didn't want to build, probably because of some strangeness in the .vcprops files.
> amd64.vcprops defines PythonExe to $(HOST_PYTHON) which isn't defined.

See PCbuild/readme.txt.

> Please don't tell me that I need to install Perl :)

You don't need to install Perl; see PCbuild/readme.txt.

Regards,
Martin

From kristjan at ccpgames.com  Tue Jun  8 21:58:53 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Tue, 8 Jun 2010 19:58:53 +0000
Subject: [Python-Dev] issue 8832: Add a context manager to dom.minidom
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D73B85D@exchis.ccp.ad.local>

I haven't had any comment on this patch, are there any objections?
http://bugs.python.org/issue8832

K

From ncoghlan at gmail.com  Tue Jun  8 22:49:01 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 09 Jun 2010 06:49:01 +1000
Subject: [Python-Dev] issue 8832: Add a context manager to dom.minidom
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D73B85D@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73B85D@exchis.ccp.ad.local>
Message-ID: <4C0EACBD.4020202@gmail.com>

On 09/06/10 05:58, Kristján Valur Jónsson wrote:
> I haven't had any comment on this patch, are there any objections?
>
> http://bugs.python.org/issue8832

Sounds good to me. One of the nice things about the context management 
protocol is that it doesn't interfere with any code that isn't 
explicitly written to take advantage of it.
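
As a minimal sketch of that point, assuming the context-manager support
from issue 8832 behaves as it does in later Python 3 releases (where a
minidom node calls unlink() when the "with" block exits), code that uses
"with" and code that ignores it can sit side by side. The XML strings and
variable names below are made up for illustration:

```python
from xml.dom import minidom

# Assumption: minidom nodes implement __enter__/__exit__ (later Python 3).
# Leaving the block unlinks the document tree automatically.
with minidom.parseString("<root><item/></root>") as doc:
    root_tag = doc.documentElement.tagName
    item_count = len(doc.getElementsByTagName("item"))

# Code that never uses "with" keeps working exactly as before;
# unlink() can still be called by hand.
legacy_doc = minidom.parseString("<root/>")
legacy_tag = legacy_doc.documentElement.tagName
legacy_doc.unlink()
```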

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

From victor.stinner at haypocalc.com  Wed Jun  9 01:53:14 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 9 Jun 2010 01:53:14 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
Message-ID: <201006090153.14190.victor.stinner@haypocalc.com>

There are two opposite issues in the bug tracker:

   #7475: codecs missing: base64 bz2 hex zlib ...
   -> reintroduce the codecs removed from Python3

   #8838: Remove codecs.readbuffer_encode()
   -> remove the last part of the removed codecs

If I understood correctly, the question is: should the codecs module contain only
encoding codecs, or should it also contain other kinds of codecs?

The encoding codec API is now strict (encode: str->bytes, decode: bytes->str), so
it's not possible to reuse str.encode() or bytes.decode() for the other
codecs. Marc-Andre Lemburg proposed adding .transform() and .untransform()
methods to the str, bytes and bytearray types. If I understood correctly, it would
look like:

   >>> b'abc'.transform("hex")
   '616263'
   >>> '616263'.untransform("hex")
   b'abc'

I suppose that each codec will have a different list of accepted input and 
output types. Example:

   bz2: encode:bytes->bytes, decode:bytes->bytes
   rot13: encode:str->str, decode:str->str
   hex: encode:bytes->str, decode: str->bytes

And so "abc".encode("bz2") would raise a TypeError.
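
A small sketch of that strictness, assuming a later Python 3 in which the
bz2_codec codec is registered again (it was not at the time of writing).
The exact exception type raised by str.encode() has differed between
versions, so the hypothetical helper try_str_encode() below accepts both
TypeError and LookupError rather than asserting one of them:

```python
import codecs

def try_str_encode(text, codec_name):
    """Return the name of the exception str.encode() raises, or None."""
    try:
        text.encode(codec_name)
    except (TypeError, LookupError) as exc:
        return type(exc).__name__
    return None

# str.encode() is reserved for text encodings, so a compression codec
# is refused while a real text encoding is accepted.
rejected = try_str_encode("abc", "bz2_codec")
accepted = try_str_encode("abc", "utf-8")

# The generic codecs functions still reach the bytes->bytes codec.
compressed = codecs.encode(b"abc", "bz2_codec")
restored = codecs.decode(compressed, "bz2_codec")
```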

--

In my opinion, we should not mix codecs of different kinds (compression,
cipher, etc.) because their input and output types are different. It would make
more sense to create a standard API for each kind of codec. Existing examples
of standard APIs in Python: hashlib, shutil.make_archive(), database API, etc.

-- 
Victor Stinner
http://www.haypocalc.com/

From alexandre at peadrop.com  Wed Jun  9 05:58:10 2010
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Tue, 8 Jun 2010 20:58:10 -0700
Subject: [Python-Dev] Future of 2.x.
Message-ID: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>

Is there any plan for a 2.8 release? If not, I will go through the
tracker and close outstanding backport requests of 3.x features to
2.x.

-- Alexandre

From benjamin at python.org  Wed Jun  9 06:13:33 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 8 Jun 2010 23:13:33 -0500
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
Message-ID: <AANLkTilKvkcmykeiY7iDGvAxC5nhNrvKRnT7y7OKpU2r@mail.gmail.com>

2010/6/8 Alexandre Vassalotti <alexandre at peadrop.com>:
> Is there is any plan for a 2.8 release? If not, I will go through the
> tracker and close outstanding backport requests of 3.x features to
> 2.x.

Not from the core development team.


-- 
Regards,
Benjamin

From orsenthil at gmail.com  Wed Jun  9 06:30:00 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Wed, 9 Jun 2010 10:00:00 +0530
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
Message-ID: <AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>

On Wed, Jun 9, 2010 at 9:28 AM, Alexandre Vassalotti
<alexandre at peadrop.com> wrote:

> Is there is any plan for a 2.8 release? If not, I will go through the
> tracker and close outstanding backport requests of 3.x features to

You mean simply mark them as Won't-Fix and close them? I doubt this is a
desirable action to take.
Even though they are new features, it would still be a good idea to
introduce some of them in minor releases of 2.7. I know this
deviates from the process, but it could be an option considering that
2.7 is the last 2.x release. This is just my opinion.

--
Senthil

From fdrake at acm.org  Wed Jun  9 07:15:09 2010
From: fdrake at acm.org (Fred Drake)
Date: Wed, 9 Jun 2010 01:15:09 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com> 
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
Message-ID: <AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>

On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran <orsenthil at gmail.com> wrote:
> it would still be a good idea to
> introduce some of them in minor releases in 2.7. I know, this
> deviating from the process, but it could be an option considering that
> 2.7 is the last of 2.x release.

I disagree.

If there are going to be features going into *any* post 2.7.0 version,
there's no reason not to increment the revision number to 2.8.

Since there's also a well-advertised decision that 2.7 will be the
last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
the no-features-in-bugfix-releases policy.  We've seen violations
cause trouble and confusion, but we've not seen it be successful.

The policy wasn't arbitrary; let's stick to it.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"Chaos is the score upon which reality is written." --Henry Miller

From chrism at plope.com  Wed Jun  9 08:26:28 2010
From: chrism at plope.com (Chris McDonough)
Date: Wed, 09 Jun 2010 02:26:28 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
Message-ID: <1276064788.2227.122.camel@thinko>

On Wed, 2010-06-09 at 01:15 -0400, Fred Drake wrote:
> On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran <orsenthil at gmail.com> wrote:
> > it would still be a good idea to
> > introduce some of them in minor releases in 2.7. I know, this
> > deviating from the process, but it could be an option considering that
> > 2.7 is the last of 2.x release.
> 
> I disagree.
> 
> If there are going to be features going into *any* post 2.7.0 version,
> there's no reason not to increment the revision number to 2.8,
> 
> Since there's also a well-advertised decision that 2.7 will be the
> last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
> the no-features-in-bugfix-releases policy.  We've seen violations
> cause trouble and confusion, but we've not seen it be successful.
> 
> The policy wasn't arbitrary; let's stick to it.

It might be useful to copy the identifiers and URLs of all the backport
request tickets into some other repository, or to create some unique
state in roundup for these.  Rationale: it's almost certain that if the
existing Python core maintainers won't evolve Python 2.X past 2.7, some
other group will, and losing existing context for that would kinda suck.

- C


From stephen at xemacs.org  Wed Jun  9 10:07:17 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 09 Jun 2010 17:07:17 +0900
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <1276064788.2227.122.camel@thinko>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
Message-ID: <871vcghbnu.fsf@uwakimon.sk.tsukuba.ac.jp>

Chris McDonough writes:

 > It might be useful to copy the identifiers and URLs of all the backport
 > request tickets into some other repository, or to create some unique
 > state in roundup for these.

A keyword would do.  Please don't add a status or something like that,
though.


From mal at egenix.com  Wed Jun  9 10:41:29 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Jun 2010 10:41:29 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <201006090153.14190.victor.stinner@haypocalc.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
Message-ID: <4C0F53B9.2020302@egenix.com>

Victor Stinner wrote:
> There are two opposite issues in the bug tracker:
> 
>    #7475: codecs missing: base64 bz2 hex zlib ...
>    -> reintroduce the codecs removed from Python3
> 
>    #8838: Remove codecs.readbuffer_encode()
>    -> remove the last part of the removed codecs
> 
> If I understood correctly, the question is: should codecs module only contain 
> encoding codecs, or contain also other kind of codecs.

Sorry, but I can only repeat what I've already mentioned
a few times on the tracker items: this is a misunderstanding.

The codec system does not mandate a specific type combination
(and that's by design). Only the helper methods .encode() and
.decode() on bytes and str objects in Python3 do, in order to
provide type safety.

> Encoding codec API is now strict (encode: str->bytes, decode: bytes->str), 
> it's not possible to reuse str.encode() or bytes.decode() for the other 
> codecs. Marc-Andre Lemburg proposed to add .transform() and .untransform() 
> methods to str, bytes and bytearray types. If I understood correctly, it would 
> look like:
> 
>    >>> b'abc'.transform("hex")
>    '616263'
>    >>> '616263'.untransform("hex")
>    b'abc'

No, .transform() and .untransform() will be the interface to same-type
codecs, i.e. ones that convert bytes to bytes or str to str. As with
.encode()/.decode(), these helper methods also implement type safety
of the return type.

The above example will read:

    >>> b'abc'.transform("hex")
    b'616263'
    >>> b'616263'.untransform("hex")
    b'abc'

> I suppose that each codec will have a different list of accepted input and 
> output types. Example:
> 
>    bz2: encode:bytes->bytes, decode:bytes->bytes
>    rot13: encode:str->str, decode:str->str
>    hex: encode:bytes->str, decode: str->bytes

hex will do bytes->bytes in both directions, just like it does
in Python2.

The methods to be used will be .transform() for the encode direction
and .untransform() for the decode direction.
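
A rough sketch of how such helpers could behave, written as plain
functions because the methods themselves are only a proposal. The
function names transform() and untransform() and the same-type check
are assumptions for illustration; the sketch also assumes a Python 3 in
which the hex_codec and rot_13 codecs are registered:

```python
import codecs

def transform(data, codec_name):
    """Hypothetical stand-in for the proposed .transform() method."""
    result, _consumed = codecs.lookup(codec_name).encode(data)
    # Enforce the same-type rule: bytes in, bytes out; str in, str out.
    if type(result) is not type(data):
        raise TypeError("%r is not a same-type codec" % codec_name)
    return result

def untransform(data, codec_name):
    """Hypothetical stand-in for the proposed .untransform() method."""
    result, _consumed = codecs.lookup(codec_name).decode(data)
    if type(result) is not type(data):
        raise TypeError("%r is not a same-type codec" % codec_name)
    return result

hex_data = transform(b"abc", "hex_codec")       # bytes -> bytes
round_trip = untransform(hex_data, "hex_codec")  # bytes -> bytes
mangled = transform("abc", "rot_13")            # str -> str
```

Under this sketch a text encoding such as "utf-8" is rejected by
transform(), since it turns str into bytes.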

> And so "abc".encode("bz2") would raise a TypeError.

Yes.

> --
> 
> In my opinion, we should not mix codecs of different kinds (compression, 
> cipher, etc.) because the input and output types are different. It would have 
> more sense to create a standard API for each kind of codec. Existing examples 
> of standard APIs in Python: hashlib, shutil.make_archive(), database API, etc.

If you want, you can have those as well, but then you'd
have to introduce new APIs or modules, whereas the codec
interfaces have existed for quite a while in Python2 and
are in regular use.

For most applications the very simple to use codec interface
to these codecs is all that is needed, so I don't see a strong
case for adding new interfaces, e.g.

hex_data = data.transform('hex')

looks clean and neat.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 09 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                39 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From ncoghlan at gmail.com  Wed Jun  9 13:14:33 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 09 Jun 2010 21:14:33 +1000
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <4C0F53B9.2020302@egenix.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com>
Message-ID: <4C0F7799.10700@gmail.com>

On 09/06/10 18:41, M.-A. Lemburg wrote:
> The methods to be used will be .transform() for the encode direction
> and .untransform() for the decode direction.

+1, although adding this for 3.2 would need an exception to the 
moratorium approved (since it is adding new methods for builtin types).

Adding the same-type codecs back even without the helper methods should 
be fine though (less useful without the helper methods, obviously, but 
still valid).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

From solipsis at pitrou.net  Wed Jun  9 13:35:49 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Jun 2010 13:35:49 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com>
Message-ID: <20100609133549.578157ed@pitrou.net>

On Wed, 09 Jun 2010 10:41:29 +0200
"M.-A. Lemburg" <mal at egenix.com> wrote:
> 
> The above example will read:
> 
>     >>> b'abc'.transform("hex")
>     b'616263'
>     >>> b'616263'.untransform("hex")
>     b'abc'

This doesn't look right to me. Hex-encoded "data" is really text (it's
a textual representation of binary, and isn't often used as an opaque
binary transport encoding).
Of course, this is not necessarily so for all codecs. For
base64-encoded data, for example, it is debatable whether you want it
as ASCII bytes or unicode text.


From fuzzyman at voidspace.org.uk  Wed Jun  9 13:38:45 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 09 Jun 2010 12:38:45 +0100
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <20100609133549.578157ed@pitrou.net>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<20100609133549.578157ed@pitrou.net>
Message-ID: <4C0F7D45.4060706@voidspace.org.uk>

On 09/06/2010 12:35, Antoine Pitrou wrote:
> On Wed, 09 Jun 2010 10:41:29 +0200
> "M.-A. Lemburg"<mal at egenix.com>  wrote:
>    
>> The above example will read:
>>
>>      >>>  b'abc'.transform("hex")
>>      b'616263'
>>      >>>  b'616263'.untransform("hex")
>>      b'abc'
>>      
> This doesn't look right to me. Hex-encoded "data" is really text (it's
> a textual representation of binary, and isn't often used as an opaque
> binary transport encoding).
> Of course, this is not necessarily so for all codecs. For
> base64-encoded data, for example, it is debatable whether you want it
> as ASCII bytes or unicode text.
>    

But in both cases you probably want bytes -> bytes and str -> str. If 
you want text out then put text in, if you want bytes out then put bytes in.

Michael



-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From solipsis at pitrou.net  Wed Jun  9 13:40:50 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 09 Jun 2010 13:40:50 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C0F7D45.4060706@voidspace.org.uk>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net>
	<4C0F7D45.4060706@voidspace.org.uk>
Message-ID: <1276083650.3143.1.camel@localhost.localdomain>

On Wednesday 09 June 2010 at 12:38 +0100, Michael Foord wrote:
> On 09/06/2010 12:35, Antoine Pitrou wrote:
> > On Wed, 09 Jun 2010 10:41:29 +0200
> > "M.-A. Lemburg"<mal at egenix.com>  wrote:
> >    
> >> The above example will read:
> >>
> >>      >>>  b'abc'.transform("hex")
> >>      b'616263'
> >>      >>>  b'616263'.untransform("hex")
> >>      b'abc'
> >>      
> > This doesn't look right to me. Hex-encoded "data" is really text (it's
> > a textual representation of binary, and isn't often used as an opaque
> > binary transport encoding).
> > Of course, this is not necessarily so for all codecs. For
> > base64-encoded data, for example, it is debatable whether you want it
> > as ASCII bytes or unicode text.
> >    
> 
> But in both cases you probably want bytes -> bytes and str -> str. If 
> you want text out then put text in, if you want bytes out then put bytes in.

No, I don't think so. If I'm using hex "encoding", it's because I want
to see a text representation of some arbitrary bytestring (in order to
display it inside another piece of text, for example).
In other words, the purpose of hex is precisely to give a textual
display of non-textual data.
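
To make the distinction concrete, here is a small sketch assuming a
later Python 3 (bytes.hex() was added to the language after this
discussion). It shows the same hex digits produced once as text and
once as ASCII bytes; the payload and the "checksum=" message are
invented for illustration:

```python
import binascii

payload = bytes([0, 255, 16])

as_text = payload.hex()               # str: ready to embed in other text
as_bytes = binascii.hexlify(payload)  # bytes: the same digits as ASCII

# Embedding in a text message only works directly with the str form.
message = "checksum=" + as_text

# Going back from the textual form to the original bytes.
decoded = bytes.fromhex(as_text)
```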


From rdmurray at bitdance.com  Wed Jun  9 13:42:27 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Wed, 09 Jun 2010 07:42:27 -0400
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C0F7799.10700@gmail.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <4C0F7799.10700@gmail.com>
Message-ID: <20100609114228.5059821701A@kimball.webabinitio.net>

On Wed, 09 Jun 2010 21:14:33 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On 09/06/10 18:41, M.-A. Lemburg wrote:
> > The methods to be used will be .transform() for the encode direction
> > and .untransform() for the decode direction.
> 
> +1, although adding this for 3.2 would need an exception to the 
> moratorium approved (since it is adding new methods for builtin types).
> 
> Adding the same-type codecs back even without the helper methods should 
> be fine though (less useful without the helper methods, obviously, but 
> still valid).

Agreed.  And I think making an exception to the moratorium for
transform/untransform is justified, given that this is restoring a
feature that Python2 had, in a Python3 compatible manner.

--
R. David Murray                                      www.bitdance.com

From mal at egenix.com  Wed Jun  9 13:45:28 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Jun 2010 13:45:28 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <4C0F7799.10700@gmail.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<4C0F7799.10700@gmail.com>
Message-ID: <4C0F7ED8.9000000@egenix.com>

Nick Coghlan wrote:
> On 09/06/10 18:41, M.-A. Lemburg wrote:
>> The methods to be used will be .transform() for the encode direction
>> and .untransform() for the decode direction.
> 
> +1, although adding this for 3.2 would need an exception to the
> moratorium approved (since it is adding new methods for builtin types).

Good point.

We already discussed these methods in 2008 and Guido
approved them back then, so perhaps that's a good argument
for an exception.

> Adding the same-type codecs back even without the helper methods should
> be fine though (less useful without the helper methods, obviously, but
> still valid).

Agreed.

The new methods would make it easier to port to Python3, though,
since e.g. data.encode('hex') is easier to convert to
data.transform('hex').

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Wed Jun  9 13:53:08 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Jun 2010 13:53:08 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <20100609133549.578157ed@pitrou.net>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<20100609133549.578157ed@pitrou.net>
Message-ID: <4C0F80A4.8070002@egenix.com>

Antoine Pitrou wrote:
> On Wed, 09 Jun 2010 10:41:29 +0200
> "M.-A. Lemburg" <mal at egenix.com> wrote:
>>
>> The above example will read:
>>
>>     >>> b'abc'.transform("hex")
>>     b'616263'
>>     >>> b'616263'.untransform("hex")
>>     b'abc'
> 
> This doesn't look right to me. Hex-encoded "data" is really text (it's
> a textual representation of binary, and isn't often used as an opaque
> binary transport encoding).

Then we'd need new .encode() and .decode() methods, so that
we could write:

     >>> b'abc'.encode("hex")
     '616263'
     >>> '616263'.decode("hex")
     b'abc'

The reason is that we don't have helper methods for the directions
encoding: bytes->str and
decoding: str->bytes.

We do in Python2, so perhaps adding those back as well would
be a possibility, but I don't want to strain all this too much.

It's always possible to use:

codecs.encode(b'abc', 'hex')
and
codecs.decode('616263', 'hex')

instead.

> Of course, this is not necessarily so for all codecs. For
> base64-encoded data, for example, it is debatable whether you want it
> as ASCII bytes or unicode text.

Since there are multiple ways of choosing types, I would like
to use the ones that Python2 already chose, if possible.

The only one I'm not sure about is 'rot13': this is an encoding
that is only defined for text and works by creating mangled
text, so str->str appears to be more correct than str->bytes
(which we have in Python2).

-- 
Marc-Andre Lemburg
eGenix.com


From dirkjan at ochtman.nl  Wed Jun  9 13:57:05 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Wed, 9 Jun 2010 13:57:05 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... 	codecs
In-Reply-To: <1276083650.3143.1.camel@localhost.localdomain>
References: <201006090153.14190.victor.stinner@haypocalc.com> 
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net> 
	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
Message-ID: <AANLkTinthUwZTTNuI7Bfsz-oNKsNBhZ68AuBuSwmtvPq@mail.gmail.com>

On Wed, Jun 9, 2010 at 13:40, Antoine Pitrou <solipsis at pitrou.net> wrote:
> No, I don't think so. If I'm using hex "encoding", it's because I want
> to see a text representation of some arbitrary bytestring (in order to
> display it inside another piece of text, for example).
> In other words, the purpose of hex is precisely to give a textual
> display of non-textual data.

Or I want to encode binary data in a non-binary-safe protocol, in
which case I probably want bytes.

Cheers,

Dirkjan

From p.f.moore at gmail.com  Wed Jun  9 13:58:17 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Wed, 9 Jun 2010 12:58:17 +0100
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <1276064788.2227.122.camel@thinko>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
Message-ID: <AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>

On 9 June 2010 07:26, Chris McDonough <chrism at plope.com> wrote:
> On Wed, 2010-06-09 at 01:15 -0400, Fred Drake wrote:
>> On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran <orsenthil at gmail.com> wrote:
>> > it would still be a good idea to
>> > introduce some of them in minor releases in 2.7. I know, this
>> > deviating from the process, but it could be an option considering that
>> > 2.7 is the last of 2.x release.
>>
>> I disagree.
>>
>> If there are going to be features going into *any* post 2.7.0 version,
>> there's no reason not to increment the revision number to 2.8,
>>
>> Since there's also a well-advertised decision that 2.7 will be the
>> last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
>> the no-features-in-bugfix-releases policy.  We've seen violations
>> cause trouble and confusion, but we've not seen it be successful.
>>
>> The policy wasn't arbitrary; let's stick to it.
>
> It might be useful to copy the identifiers and URLs of all the backport
> request tickets into some other repository, or to create some unique
> state in roundup for these.  Rationale: it's almost certain that if the
> existing Python core maintainers won't evolve Python 2.X past 2.7, some
> other group will, and losing existing context for that would kinda suck.

Personally, as a user of Python, I'm already getting tired of the "we
won't let Python 2.x die" arguments. Unless and until some other group
comes along and says they definitely plan to pick up Python 2.x
development (and set up or agree shared usage of all the relevant
infrastructure, bug tracker, developers list, VCS, etc) I see the core
developers' decision as made. 2.7 is the last Python 2.x release, and
all further development will be on 3.x.

On that basis I'm +1 on Alexandre's proposal. A 3rd party planning on
working on a 2.8 release (not that I think such a party currently
exists) can step up and extract the relevant tickets for their later
reference if they feel the need. Let's not stop moving forward for the
convenience of a hypothetical 2.8 development team.

Paul.

From solipsis at pitrou.net  Wed Jun  9 14:17:48 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Jun 2010 14:17:48 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... 	codecs
In-Reply-To: <AANLkTinthUwZTTNuI7Bfsz-oNKsNBhZ68AuBuSwmtvPq@mail.gmail.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net>
	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
	<AANLkTinthUwZTTNuI7Bfsz-oNKsNBhZ68AuBuSwmtvPq@mail.gmail.com>
Message-ID: <20100609141748.733d3e94@pitrou.net>

On Wed, 9 Jun 2010 13:57:05 +0200
Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
> On Wed, Jun 9, 2010 at 13:40, Antoine Pitrou <solipsis at pitrou.net> wrote:
> > No, I don't think so. If I'm using hex "encoding", it's because I want
> > to see a text representation of some arbitrary bytestring (in order to
> > display it inside another piece of text, for example).
> > In other words, the purpose of hex is precisely to give a textual
> > display of non-textual data.
> 
> Or I want to encode binary data in a non-binary-safe protocol, in
> which case I probably want bytes.

In this case you would probably choose a more space-efficient
representation, such as base64 or base85.  Which is why I think the
purpose of hex is mostly for textual representation.
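
A quick size comparison as a sketch, assuming a later Python 3 whose
base64 module provides b85encode() (it did not exist when this was
written). The 32-byte payload is arbitrary:

```python
import base64

payload = bytes(range(32))  # 32 arbitrary bytes

hex_form = base64.b16encode(payload)  # 2 output characters per input byte
b64_form = base64.b64encode(payload)  # 4 output characters per 3 input bytes
b85_form = base64.b85encode(payload)  # 5 output characters per 4 input bytes

sizes = {
    "hex": len(hex_form),
    "base64": len(b64_form),
    "base85": len(b85_form),
}
```

For this payload, hex doubles the size while base64 and base85 stay
noticeably smaller.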

Regards

Antoine.

From victor.stinner at haypocalc.com  Wed Jun  9 14:18:44 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 9 Jun 2010 14:18:44 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C0F53B9.2020302@egenix.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com>
Message-ID: <201006091418.44680.victor.stinner@haypocalc.com>

On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
> No, .transform() and .untransform() will be interface to same-type
> codecs, i.e. ones that convert bytes to bytes or str to str. As with
> .encode()/.decode() these helper methods also implement type safety
> of the return type.

What about buffer compatible objects like array.array(), memoryview(), etc.? 
Should we use codecs.encode() / codecs.decode() for these types?

-- 
Victor Stinner
http://www.haypocalc.com/

From mal at egenix.com  Wed Jun  9 14:34:13 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Jun 2010 14:34:13 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <201006091418.44680.victor.stinner@haypocalc.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<201006091418.44680.victor.stinner@haypocalc.com>
Message-ID: <4C0F8A45.5050500@egenix.com>

Victor Stinner wrote:
> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>> No, .transform() and .untransform() will be the interface to same-type
>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>> .encode()/.decode() these helper methods also implement type safety
>> of the return type.
> 
> What about buffer compatible objects like array.array(), memoryview(), etc.? 
> Should we use codecs.encode() / codecs.decode() for these types?

Yes, or call the encoders/decoders directly by first fetching
them via codecs.lookup().
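
A minimal sketch of both routes, assuming a Python 3 release where the bytes-to-bytes codecs have been restored (3.2 or later); the name "hex_codec" and the sample buffers are only illustrative choices:

```python
import array
import codecs

# Fetch the codec once; CodecInfo exposes the stateless encode/decode functions.
info = codecs.lookup("hex_codec")

buffers = (
    b"abc",
    bytearray(b"abc"),
    memoryview(b"abc"),
    array.array("B", b"abc"),
)

results = []
for buf in buffers:
    via_helper = codecs.encode(buf, "hex_codec")   # module-level helper
    via_lookup, consumed = info.encode(buf)         # encoder called directly
    results.append((via_helper, via_lookup, consumed))

decoded, decoded_consumed = info.decode(b"616263")
print(results[0], decoded)
```

The direct encoder returns an (output, length consumed) tuple, whereas codecs.encode() returns only the output.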

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 09 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                39 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From ncoghlan at gmail.com  Wed Jun  9 14:47:22 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 09 Jun 2010 22:47:22 +1000
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <201006091418.44680.victor.stinner@haypocalc.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<201006091418.44680.victor.stinner@haypocalc.com>
Message-ID: <4C0F8D5A.8010706@gmail.com>

On 09/06/10 22:18, Victor Stinner wrote:
> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>> No, .transform() and .untransform() will be the interface to same-type
>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>> .encode()/.decode() these helper methods also implement type safety
>> of the return type.
>
> What about buffer compatible objects like array.array(), memoryview(), etc.?
> Should we use codecs.encode() / codecs.decode() for these types?

There are probably enough subtleties that this is all worth specifying 
in a PEP:

- which codecs from 2.x are to be restored
- the domain each codec operates in (binary data or text)*
- review behaviour of codecs.encode and codecs.decode
- behaviour of the new str, bytes and bytearray (un)transform methods
- whether to add helper methods for reverse codecs (like base64)

The PEP would also serve as a reference back to both this discussion and 
the previous one (which was long enough ago that I've forgotten most of it).

*Some are obvious, such as rot13 being text only, and bz2 being binary 
data only, but others are less clear. hex could be either str->str or 
bytes->bytes, since ''.join(map(chr, seq)) and b''.join(map(ord, seq)) 
allow each of them to be implemented trivially in terms of the other. As 
Antoine pointed out, base64 is really a reverse codec (encode from 
bytes->str, decode from str->bytes), so it still wouldn't be covered by 
the new transformation helper methods.
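
The str/bytes equivalence for hex can be sketched as follows. This is an illustration rather than a proposed implementation, and it uses bytes(map(ord, ...)) for the str->bytes direction, because in Python 3 b''.join() rejects the integers that ord() produces:

```python
import binascii

raw = b"abc"

hex_bytes = binascii.hexlify(raw)         # bytes -> bytes: b'616263'
hex_text = "".join(map(chr, hex_bytes))   # reinterpret the ASCII digits as str
round_trip = bytes(map(ord, hex_text))    # and back again to bytes

print(hex_bytes, hex_text, binascii.unhexlify(round_trip))
```

Because hex digits are all ASCII, either direction is a lossless relabelling of the same characters.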

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia
---------------------------------------------------------------

From facundobatista at gmail.com  Wed Jun  9 14:55:43 2010
From: facundobatista at gmail.com (Facundo Batista)
Date: Wed, 9 Jun 2010 09:55:43 -0300
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
Message-ID: <AANLkTil6Qk-7mkzE8Y_YM6MkyAmYtvQHxEAHxJjLPk8N@mail.gmail.com>

On Wed, Jun 9, 2010 at 8:58 AM, Paul Moore <p.f.moore at gmail.com> wrote:

> On that basis I'm +1 on Alexandre's proposal. A 3rd party planning on
> working on a 2.8 release (not that I think such a party currently
> exists) can step up and extract the relevant tickets for their later
> reference if they feel the need. Let's not stop moving forward for the
> convenience of a hypothetical 2.8 development team.

Yes, closing the tickets as "won't fix" and tagging them as
"will-never-happen-in-2.x" or something, is the best combination of
both worlds: it will clean the tracker and ease further developments,
and will allow anybody to pick up those tickets later.

(I'm +1 too to Alexandre's proposal, btw)

-- 
.    Facundo

Blog: http://www.taniquetil.com.ar/plog/
PyAr: http://www.python.org/ar/

From steve at holdenweb.com  Wed Jun  9 14:56:30 2010
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 09 Jun 2010 20:56:30 +0800
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
Message-ID: <huo31u$fbf$1@dough.gmane.org>

Paul Moore wrote:
> On 9 June 2010 07:26, Chris McDonough <chrism at plope.com> wrote:
>> On Wed, 2010-06-09 at 01:15 -0400, Fred Drake wrote:
>>> On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran <orsenthil at gmail.com> wrote:
>>>> it would still be a good idea to
>>>> introduce some of them in minor releases in 2.7. I know, this
>>>> deviating from the process, but it could be an option considering that
>>>> 2.7 is the last of 2.x release.
>>> I disagree.
>>>
>>> If there are going to be features going into *any* post 2.7.0 version,
>>> there's no reason not to increment the revision number to 2.8,
>>>
>>> Since there's also a well-advertised decision that 2.7 will be the
>>> last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
>>> the no-features-in-bugfix-releases policy.  We've seen violations
>>> cause trouble and confusion, but we've not seen it be successful.
>>>
>>> The policy wasn't arbitrary; let's stick to it.
>> It might be useful to copy the identifiers and URLs of all the backport
>> request tickets into some other repository, or to create some unique
>> state in roundup for these.  Rationale: it's almost certain that if the
>> existing Python core maintainers won't evolve Python 2.X past 2.7, some
>> other group will, and losing existing context for that would kinda suck.
> 
> Personally, as a user of Python, I'm already getting tired of the "we
> won't let Python 2.x die" arguments. Unless and until some other group
> comes along and says they definitely plan to pick up Python 2.x
> development (and set up or agree shared usage of all the relevant
> infrastructure, bug tracker, developers list, VCS, etc) I see the core
> developers' decision as made. 2.7 is the last Python 2.x release, and
> all further development will be on 3.x.
> 
> On that basis I'm +1 on Alexandre's proposal. A 3rd party planning on
> working on a 2.8 release (not that I think such a party currently
> exists) can step up and extract the relevant tickets for their later
> reference if they feel the need. Let's not stop moving forward for the
> convenience of a hypothetical 2.8 development team.
> 
How does throwing away information represent "moving forward"?

I have to say I am surprised by the current lack of momentum behind 3.x,
but I do know users who consider that their current investment in the
2.x series is unlikely to migrate to 3.x in the  next five years, and it
would be strange if they didn't continue to develop 2.x (including
backporting some 3.x features).

I don't see why we have to make such work harder than it need be.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From fuzzyman at voidspace.org.uk  Wed Jun  9 15:05:51 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 09 Jun 2010 14:05:51 +0100
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <huo31u$fbf$1@dough.gmane.org>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
	<huo31u$fbf$1@dough.gmane.org>
Message-ID: <4C0F91AF.1000401@voidspace.org.uk>

On 09/06/2010 13:56, Steve Holden wrote:
> Paul Moore wrote:
>    
>> On 9 June 2010 07:26, Chris McDonough<chrism at plope.com>  wrote:
>>      
>>> On Wed, 2010-06-09 at 01:15 -0400, Fred Drake wrote:
>>>        
>>>> On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran<orsenthil at gmail.com>  wrote:
>>>>          
>>>>> it would still be a good idea to
>>>>> introduce some of them in minor releases in 2.7. I know, this
>>>>> deviating from the process, but it could be an option considering that
>>>>> 2.7 is the last of 2.x release.
>>>>>            
>>>> I disagree.
>>>>
>>>> If there are going to be features going into *any* post 2.7.0 version,
>>>> there's no reason not to increment the revision number to 2.8,
>>>>
>>>> Since there's also a well-advertised decision that 2.7 will be the
>>>> last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
>>>> the no-features-in-bugfix-releases policy.  We've seen violations
>>>> cause trouble and confusion, but we've not seen it be successful.
>>>>
>>>> The policy wasn't arbitrary; let's stick to it.
>>>>          
>>> It might be useful to copy the identifiers and URLs of all the backport
>>> request tickets into some other repository, or to create some unique
>>> state in roundup for these.  Rationale: it's almost certain that if the
>>> existing Python core maintainers won't evolve Python 2.X past 2.7, some
>>> other group will, and losing existing context for that would kinda suck.
>>>        
>> Personally, as a user of Python, I'm already getting tired of the "we
>> won't let Python 2.x die" arguments. Unless and until some other group
>> comes along and says they definitely plan to pick up Python 2.x
>> development (and set up or agree shared usage of all the relevant
>> infrastructure, bug tracker, developers list, VCS, etc) I see the core
>> developers' decision as made. 2.7 is the last Python 2.x release, and
>> all further development will be on 3.x.
>>
>> On that basis I'm +1 on Alexandre's proposal. A 3rd party planning on
>> working on a 2.8 release (not that I think such a party currently
>> exists) can step up and extract the relevant tickets for their later
>> reference if they feel the need. Let's not stop moving forward for the
>> convenience of a hypothetical 2.8 development team.
>>
>>      
> How does throwing away information represent "moving forward"?
>
>    

I'm inclined to agree. There is no *need* to close these tickets now.

> I have to say I am surprised by the current lack of momentum behind 3.x,
> but I do know users who consider that their current investment in the
> 2.x series is unlikely to migrate to 3.x in the  next five years, and it
> would be strange if they didn't continue to develop 2.x (including
> backporting some 3.x features).
>    

Who is the 'they' in your last sentence here? It seems to imply the 
'users'... Certainly no-one specific (neither individual nor group) has 
stepped up and said they will continue to develop Python 2.x. Even if 
they did it is not clear that they would use the python.org 
infrastructure to do it. The Python core developers (basically) *have* 
moved on and are unlikely to further develop 2.x. We'll see though, it's 
all speculation at the moment.

All the best,

Michael

> I don't see why we have to make such work harder than it need be.
>
> regards
>   Steve
>    


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From barry at python.org  Wed Jun  9 16:12:24 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 9 Jun 2010 10:12:24 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
Message-ID: <20100609101224.4425723d@heresy>

On Jun 09, 2010, at 01:15 AM, Fred Drake wrote:

>On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran <orsenthil at gmail.com> wrote:
>> it would still be a good idea to
>> introduce some of them in minor releases in 2.7. I know, this
>> deviating from the process, but it could be an option considering that
>> 2.7 is the last of 2.x release.
>
>I disagree.
>
>If there are going to be features going into *any* post 2.7.0 version,
>there's no reason not to increment the revision number to 2.8,
>
>Since there's also a well-advertised decision that 2.7 will be the
>last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
>the no-features-in-bugfix-releases policy.  We've seen violations
>cause trouble and confusion, but we've not seen it be successful.
>
>The policy wasn't arbitrary; let's stick to it.

I completely agree with Fred.  New features in point releases will cause many
more headaches than opening up a 2.8, which I still hope we don't do.  I'd
rather see all that pent up energy focussed on doing whatever we can to help
people transition to Python 3.

-Barry

From victor.stinner at haypocalc.com  Wed Jun  9 16:35:38 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 9 Jun 2010 16:35:38 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C0F8D5A.8010706@gmail.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<201006091418.44680.victor.stinner@haypocalc.com>
	<4C0F8D5A.8010706@gmail.com>
Message-ID: <201006091635.38538.victor.stinner@haypocalc.com>

On Wednesday 09 June 2010 14:47:22, Nick Coghlan wrote:
> *Some are obvious, such as rot13 being text only,

Should rot13 shift any unicode character, or just a-z and A-Z?

Python2 only changes characters a-z and A-Z, and uses ISO-8859-1 to encode
unicode to byte string.

>>> u"abc é".encode("rot13")
'nop \xe9'
>>> u"abc \u2c01".encode("rot13")
Traceback (most recent call last):
  ...
UnicodeEncodeError: 'charmap' codec can't encode character u'\u2c01' in 
position 4: character maps to <undefined>
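
For comparison, a small sketch of what a text-only (str -> str) rot13 looks like, assuming a current Python 3 where the codec is reached through codecs.encode() under the name "rot_13"; the sample strings mirror the session above:

```python
import codecs

# Only a-z and A-Z are rotated; every other character passes through unchanged,
# so there is no intermediate byte encoding that could fail.
latin1_result = codecs.encode("abc \u00e9", "rot_13")
non_latin1_result = codecs.encode("abc \u2c01", "rot_13")

print(ascii(latin1_result))
print(ascii(non_latin1_result))
```

With a str -> str codec the second call no longer raises UnicodeEncodeError, since no ISO-8859-1 step is involved.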

-- 
Victor Stinner
http://www.haypocalc.com/

From mal at egenix.com  Wed Jun  9 16:42:28 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 09 Jun 2010 16:42:28 +0200
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <4C0F91AF.1000401@voidspace.org.uk>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>	<huo31u$fbf$1@dough.gmane.org>
	<4C0F91AF.1000401@voidspace.org.uk>
Message-ID: <4C0FA854.1080400@egenix.com>

Michael Foord wrote:
>> How does throwing away information represent "moving forward"?
> 
> I'm inclined to agree. There is no *need* to close these tickets now.
> 
>> I have to say I am surprised by the current lack of momentum behind 3.x,
>> but I do know users who consider that their current investment in the
>> 2.x series is unlikely to migrate to 3.x in the  next five years, and it
>> would be strange if they didn't continue to develop 2.x (including
>> backporting some 3.x features).
>>    
> 
> Who is the 'they' in your last sentence here? It seems to imply the
> 'users'... Certainly no-one specific (neither individual nor group) has
> stepped up and said they will continue to develop Python 2.x. Even if
> they did it is not clear that they would use the python.org
> infrastructure to do it. The Python core developers (basically) *have*
> moved on and are unlikely to further develop 2.x. We'll see though, it's
> all speculation at the moment.

I think it also depends on which core developers you ask :-)

Many of them are not keen on having to maintain Python2 for much
longer, but some of them may have assets codified in Python2
or interests based on Python2 that they'll want to keep for
more than just another 5 years.

E.g. we still have customers that are on Python 2.3 and have
just recently considered moving to Python 2.5. Depending on where
you look, motivations are rather diverse.

It's certainly not fair to require all core developers to
continue working on Python2, but it would also be unfair to
cancel out that possibility for a subset of interested devs.
Even more so, since it doesn't really create any extra work
for those that have no interest.

-- 
Marc-Andre Lemburg
eGenix.com

From barry at python.org  Wed Jun  9 17:12:38 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 9 Jun 2010 11:12:38 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <4C0FA854.1080400@egenix.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
	<huo31u$fbf$1@dough.gmane.org> <4C0F91AF.1000401@voidspace.org.uk>
	<4C0FA854.1080400@egenix.com>
Message-ID: <20100609111238.7c017907@heresy>

On Jun 09, 2010, at 04:42 PM, M.-A. Lemburg wrote:

>Many of them are not keen on having to maintain Python2 for much
>longer, but some of them may have assets codified in Python2
>or interests based on Python2 that they'll want to keep for
>more than just another 5 years.
>
>E.g. we still have customers that are on Python 2.3 and have
>just recently considered moving to Python 2.5. Depending on where
>you look, motivations are rather diverse.
>
>It's certainly not fair to require all core developers to
>continue working on Python2, but it would also be unfair to
>cancel out that possibility for a subset of interested devs.
>Even more so, since it doesn't really create any extra work
>for those that have no interest.

Note that Python 2.7 will be *maintained* for a very long time, which should
satisfy those folks who still require Python 2.  Anybody on older (and
currently unmaintained) versions of Python 2 will not care about new features
so a Python 2.8 wouldn't help them anyway.

-Barry

From janssen at parc.com  Wed Jun  9 18:07:04 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 9 Jun 2010 09:07:04 PDT
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <1276083650.3143.1.camel@localhost.localdomain>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net>
	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
Message-ID: <19213.1276099624@parc.com>

Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Wednesday 09 June 2010 at 12:38 +0100, Michael Foord wrote:
> > On 09/06/2010 12:35, Antoine Pitrou wrote:
> > > On Wed, 09 Jun 2010 10:41:29 +0200
> > > "M.-A. Lemburg"<mal at egenix.com>  wrote:
> > >    
> > >> The above example will read:
> > >>
> > >>      >>>  b'abc'.transform("hex")
> > >>      b'616263'
> > >>      >>>  b'616263'.untransform("hex")
> > >>      b'abc'
> > >>      
> > > This doesn't look right to me. Hex-encoded "data" is really text (it's
> > > a textual representation of binary, and isn't often used as an opaque
> > > binary transport encoding).
> > > Of course, this is not necessarily so for all codecs. For
> > > base64-encoded data, for example, it is debatable whether you want it
> > > as ASCII bytes or unicode text.
> > >    
> > 
> > But in both cases you probably want bytes -> bytes and str -> str. If 
> > you want text out then put text in, if you want bytes out then put bytes in.
> 
> No, I don't think so. If I'm using hex "encoding", it's because I want
> to see a text representation of some arbitrary bytestring (in order to
> display it inside another piece of text, for example).
> In other words, the purpose of hex is precisely to give a textual
> display of non-textual data.

Yes.  And base64, and quoted-printable, etc.

Bill

From janssen at parc.com  Wed Jun  9 18:13:20 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 9 Jun 2010 09:13:20 PDT
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <20100609111238.7c017907@heresy>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
	<huo31u$fbf$1@dough.gmane.org> <4C0F91AF.1000401@voidspace.org.uk>
	<4C0FA854.1080400@egenix.com> <20100609111238.7c017907@heresy>
Message-ID: <19370.1276100000@parc.com>

Barry Warsaw <barry at python.org> wrote:

> On Jun 09, 2010, at 04:42 PM, M.-A. Lemburg wrote:
> 
> >Many of them are not keen on having to maintain Python2 for much
> >longer, but some of them may have assets codified in Python2
> >or interests based on Python2 that they'll want to keep for
> >more than just another 5 years.
> >
> >E.g. we still have customers that are on Python 2.3 and have
> >just recently considered moving to Python 2.5. Depending on where
> >you look, motivations are rather diverse.
> >
> >It's certainly not fair to require all core developers to
> >continue working on Python2, but it would also be unfair to
> >cancel out that possibility for a subset of interested devs.
> >Even more so, since it doesn't really create any extra work
> >for those that have no interest.
> 
> Note that Python 2.7 will be *maintained* for a very long time, which
> should satisfy those folks who still require Python 2.  Anybody on
> older (and currently unmaintained) versions of Python 2 will not care
> about new features so a Python 2.8 wouldn't help them anyway.

There are two kinds of new features, though.  Those added to improve (or
at any rate modify :-) the product, and those added to keep the product
relevant to a changing external world (new operating systems, new
communication protocols, etc.)  I think it would take a pretty strong
crystal ball to be able to rule out the latter kind of feature add from
the 2.x line.

Bill

From barry at python.org  Wed Jun  9 18:32:23 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 9 Jun 2010 12:32:23 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <19370.1276100000@parc.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
	<huo31u$fbf$1@dough.gmane.org> <4C0F91AF.1000401@voidspace.org.uk>
	<4C0FA854.1080400@egenix.com> <20100609111238.7c017907@heresy>
	<19370.1276100000@parc.com>
Message-ID: <20100609123223.27838ab4@heresy>

On Jun 09, 2010, at 09:13 AM, Bill Janssen wrote:

>Barry Warsaw <barry at python.org> wrote:
>
>> Note that Python 2.7 will be *maintained* for a very long time, which
>> should satisfy those folks who still require Python 2.  Anybody on
>> older (and currently unmaintained) versions of Python 2 will not care
>> about new features so a Python 2.8 wouldn't help them anyway.
>
>There are two kinds of new features, though.  Those added to improve (or
>at any rate modify :-) the product, and those added to keep the product
>relevant to a changing external world (new operating systems, new
>communication protocols, etc.)  I think it would take a pretty strong
>crystal ball to be able to rule out the latter kind of feature add from
>the 2.x line.

The latter should mostly be supported by third party packages available in the
Cheeseshop.  To the extent that such support can't be effected by add-ons
(e.g. new OS support), I think a better approach would be to encourage and
allow unofficial ports by utilizing dvcs branches (we *are* moving to
Mercurial after Python 2.7 final is released, right?).

I think we should plan on 2.7 being the last Python 2, and spend lots of effort
to get people onto Python 3, partially by offering big carrots like Unladen
Swallow, a better/no GIL, etc.  I think it should be part of the PSF's mission
to help that happen through directed sponsorship, sprints, and other tools.

-Barry

From jnoller at gmail.com  Wed Jun  9 19:16:30 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Wed, 9 Jun 2010 13:16:30 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <20100609123223.27838ab4@heresy>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
	<huo31u$fbf$1@dough.gmane.org> <4C0F91AF.1000401@voidspace.org.uk>
	<4C0FA854.1080400@egenix.com> <20100609111238.7c017907@heresy>
	<19370.1276100000@parc.com> <20100609123223.27838ab4@heresy>
Message-ID: <AANLkTik0LwlLK5t_jdq7ti6qwGGxswHT-7f5Xuuon4nH@mail.gmail.com>

On Wed, Jun 9, 2010 at 12:32 PM, Barry Warsaw <barry at python.org> wrote:
> On Jun 09, 2010, at 09:13 AM, Bill Janssen wrote:
>
>>Barry Warsaw <barry at python.org> wrote:
>>
>>> Note that Python 2.7 will be *maintained* for a very long time, which
>>> should satisfy those folks who still require Python 2.  Anybody on
>>> older (and currently unmaintained) versions of Python 2 will not care
>>> about new features so a Python 2.8 wouldn't help them anyway.
>>
>>There are two kinds of new features, though.  Those added to improve (or
>>at any rate modify :-) the product, and those added to keep the product
>>relevant to a changing external world (new operating systems, new
>>communication protocols, etc.)  I think it would take a pretty strong
>>crystal ball to be able to rule out the latter kind of feature add from
>>the 2.x line.
>
> The latter should mostly be supported by third party packages available in the
> Cheeseshop.  To the extent that such support can't be effected by add-ons
> (e.g. new OS support), I think a better approach would be to encourage and
> allow unofficial ports by utilizing dvcs branches (we *are* moving to
> Mercurial after Python 2.7 final is released, right?).
>
> I think we should plan on 2.7 being the last Python 2, and spend lots of effort
> to get people onto Python 3, partially by offering big carrots like Unladen
> Swallow, a better/no GIL, etc.  I think it should be part of the PSF's mission
> to help that happen through directed sponsorship, sprints, and other tools.
>
> -Barry

+1 fearless FLUFL

From brett at python.org  Wed Jun  9 19:41:47 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 9 Jun 2010 10:41:47 -0700
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <20100609111238.7c017907@heresy>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com> 
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com> 
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com> 
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com> 
	<huo31u$fbf$1@dough.gmane.org> <4C0F91AF.1000401@voidspace.org.uk> 
	<4C0FA854.1080400@egenix.com> <20100609111238.7c017907@heresy>
Message-ID: <AANLkTikDFDA_BFTg46t5Uv2Psrh1TqPMU9ZTv1rXNjQb@mail.gmail.com>

On Wed, Jun 9, 2010 at 08:12, Barry Warsaw <barry at python.org> wrote:
> On Jun 09, 2010, at 04:42 PM, M.-A. Lemburg wrote:
>
>>Many of them are not keen on having to maintain Python2 for much
>>longer, but some of them may have assets codified in Python2
>>or interests based on Python2 that they'll want to keep for
>>more than just another 5 years.
>>
>>E.g. we still have customers that are on Python 2.3 and have
>>just recently considered moving to Python 2.5. Depending on where
>>you look, motivations are rather diverse.
>>
>>It's certainly not fair to require all core developers to
>>continue working on Python2, but it would also be unfair to
>>cancel out that possibility for a subset of interested devs.
>>Even more so, since it doesn't really create any extra work
>>for those that have no interest.
>
> Note that Python 2.7 will be *maintained* for a very long time, which should
> satisfy those folks who still require Python 2.  Anybody on older (and
> currently unmaintained) versions of Python 2 will not care about new features
> so a Python 2.8 wouldn't help them anyway.

The other point about Alexandre's desire to close the issues is that
nothing is really getting deleted; closed issues can still be searched
for. Alexandre simply wants to not waste anyone's time who happens to
be looking at the tracker with issues that the core team will simply
never work on. If some mythical 2.8 fork of Python comes along they
can perform a search and find the issues that were closed because they
were backports that never happened.

So +1 on closing them out.

From raymond.hettinger at gmail.com  Wed Jun  9 19:45:13 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Wed, 9 Jun 2010 10:45:13 -0700
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTilKvkcmykeiY7iDGvAxC5nhNrvKRnT7y7OKpU2r@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTilKvkcmykeiY7iDGvAxC5nhNrvKRnT7y7OKpU2r@mail.gmail.com>
Message-ID: <F72C2047-A66A-4597-85A8-122ACD68E7F2@gmail.com>


On Jun 8, 2010, at 9:13 PM, Benjamin Peterson wrote:

> 2010/6/8 Alexandre Vassalotti <alexandre at peadrop.com>:
>> Is there any plan for a 2.8 release? If not, I will go through the
>> tracker and close outstanding backport requests of 3.x features to
>> 2.x.
> 
> Not from the core development team.

The current plan is to make 2.7 the last 2.x release.
The theory is that this will encourage people to switch to 3.x.
In practice, the users will get a say in this and time will tell.

When I do polls at conferences, it seems that most participants
have briefly tried 3.x but are continuing to develop in 2.x.


Raymond

From rdmurray at bitdance.com  Wed Jun  9 19:59:20 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Wed, 09 Jun 2010 13:59:20 -0400
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <201006091635.38538.victor.stinner@haypocalc.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<201006091418.44680.victor.stinner@haypocalc.com>
	<4C0F8D5A.8010706@gmail.com>
	<201006091635.38538.victor.stinner@haypocalc.com>
Message-ID: <20100609175920.1B46A21849A@kimball.webabinitio.net>

On Wed, 09 Jun 2010 16:35:38 +0200, Victor Stinner <victor.stinner at haypocalc.com> wrote:
> On Wednesday 09 June 2010 14:47:22, Nick Coghlan wrote:
> > *Some are obvious, such as rot13 being text only,
> 
> Should rot13 shift any unicode character, or just a-z and A-Z?

The latter, unless you want to do a lot of work:

    http://unicode.org/mail-arch/unicode-ml/y2007-m12/0047.html
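
For illustration, a minimal sketch of the "just a-z and A-Z" behaviour, assuming a Python 3 interpreter in which the rot_13 text codec is reachable through codecs.encode() (the sample string is made up for this example). Non-ASCII letters, digits and punctuation pass through unchanged:

```python
import codecs

# Hypothetical sample text mixing ASCII letters, accented letters and digits.
text = "Héllo, Wörld 123"

# rot_13 is a str -> str transform; only A-Z and a-z are rotated.
rotated = codecs.encode(text, "rot_13")
print(rotated)  # Uéyyb, Jöeyq 123

# Applying the same transform again restores the original text.
restored = codecs.decode(rotated, "rot_13")
print(restored == text)  # True
```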

--
R. David Murray                                      www.bitdance.com

From tjreedy at udel.edu  Wed Jun  9 21:28:25 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 09 Jun 2010 15:28:25 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <871vcghbnu.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>
	<871vcghbnu.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <huoq0l$dmd$1@dough.gmane.org>

On 6/9/2010 4:07 AM, Stephen J. Turnbull wrote:
> Chris McDonough writes:
>
>   >  It might be useful to copy the identifiers and URLs of all the backport
>   >  request tickets into some other repository, or to create some unique
>   >  state in roundup for these.

Closed issues are not lost. They can still be searched and the result 
downloaded.

> A keyword would do.  Please don't add a status or something like that,
> though.

I believe Type: feature request; Version: 2.7; Resolution: wont fix 
should do fine now. I believe Alexandre will use the first two to find 
things to close. Anything else anyone finds could be made to match.

Terry Jan Reedy


From tjreedy at udel.edu  Wed Jun  9 21:28:30 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 09 Jun 2010 15:28:30 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <4C0FA854.1080400@egenix.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>	<huo31u$fbf$1@dough.gmane.org>	<4C0F91AF.1000401@voidspace.org.uk>
	<4C0FA854.1080400@egenix.com>
Message-ID: <huoq0q$dmd$2@dough.gmane.org>

On 6/9/2010 10:42 AM, M.-A. Lemburg wrote:

 >> Steve Holden wrote
>>> How does throwing away information represent "moving forward"?

'Closing' a tracker issue does not 'throw away' information; it *adds* 
information as to current intention.

> It's certainly not fair to require all core developers to
> continue working on Python2, but it would also be unfair to
> cancel out that possibility for a subset of interested devs.

Closing a set of issues does not cancel out that possibility. If such a 
subset of devs develops, they can easily reopen (or move) particular 
issues they are interested in working on.


From tjreedy at udel.edu  Wed Jun  9 21:39:04 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 09 Jun 2010 15:39:04 -0400
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C0F7ED8.9000000@egenix.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>	<4C0F7799.10700@gmail.com>
	<4C0F7ED8.9000000@egenix.com>
Message-ID: <huoqkl$dmd$3@dough.gmane.org>

On 6/9/2010 7:45 AM, M.-A. Lemburg wrote:
> Nick Coghlan wrote:
>> On 09/06/10 18:41, M.-A. Lemburg wrote:
>>> The methods to be used will be .transform() for the encode direction
>>> and .untransform() for the decode direction.
>>
>> +1, although adding this for 3.2 would need an exception to the
>> moratorium approved (since it is adding new methods for builtin types).

+1 also. This is neither new syntax, nor, really a new feature.
>
> Good point.
>
> We already discussed these methods in 2008 and Guido
> approved them back then, so perhaps that's a good argument
> for an exception.
>
>> Adding the same-type codecs back even without the helper methods should
>> be fine though (less useful without the helper methods, obviously, but
>> still valid).
>
> Agreed.
>
> The new methods would make it easier to port to Python3, though,
> since e.g. data.encode('hex') is easier to convert to
> data.transform('hex').

That would definitely be a point in favor of getting this in 3.2, with 
appropriate additions to 2to3.


From eric at trueblade.com  Wed Jun  9 21:40:08 2010
From: eric at trueblade.com (Eric Smith)
Date: Wed, 9 Jun 2010 15:40:08 -0400 (EDT)
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <huoq0l$dmd$1@dough.gmane.org>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<871vcghbnu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<huoq0l$dmd$1@dough.gmane.org>
Message-ID: <db73e30e840199c0b23b1702da012c2a.squirrel@mail.trueblade.com>

> On 6/9/2010 4:07 AM, Stephen J. Turnbull wrote:
> Closed issues are not lost. They can still be searched and the result
> downloaded.
>
>> A keyword would do.  Please don't add a status or something like that,
>> though.
>
> I believe Type: feature request; Version: 2.7; Resolution: wont fix
> should do fine now. I believe Alexandre will use the first two to find
> things to close. Anything else anyone finds could be made to match.

Are there any currently existing issues that match those criteria (feature
request, 2.7, won't fix)?

I don't have good connectivity here so I can't check.

Eric.


From tjreedy at udel.edu  Wed Jun  9 21:45:55 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 09 Jun 2010 15:45:55 -0400
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	...  codecs
In-Reply-To: <20100609141748.733d3e94@pitrou.net>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<20100609133549.578157ed@pitrou.net>	<4C0F7D45.4060706@voidspace.org.uk>	<1276083650.3143.1.camel@localhost.localdomain>	<AANLkTinthUwZTTNuI7Bfsz-oNKsNBhZ68AuBuSwmtvPq@mail.gmail.com>
	<20100609141748.733d3e94@pitrou.net>
Message-ID: <huor1h$i06$1@dough.gmane.org>

On 6/9/2010 8:17 AM, Antoine Pitrou wrote:
> On Wed, 9 Jun 2010 13:57:05 +0200
> Dirkjan Ochtman<dirkjan at ochtman.nl>  wrote:
>> On Wed, Jun 9, 2010 at 13:40, Antoine Pitrou<solipsis at pitrou.net>  wrote:
>>> No, I don't think so. If I'm using hex "encoding", it's because I want
>>> to see a text representation of some arbitrary bytestring (in order to
>>> display it inside another piece of text, for example).
>>> In other words, the purpose of hex is precisely to give a textual
>>> display of non-textual data.
>>
>> Or I want to encode binary data in a non-binary-safe protocol, in
>> which case I probably want bytes.
>
> In this case you would probably choose a more space-efficient
> representation, such as base64 or base85.

Unless the receiver expects hex.

Please, hextext = str(somebytes.transform('hex')) is quite easy and 
explicit and will work for any bytes to ascii-subset transform, not just 
'hex'.

Keep .transform and .untransform simple by *always* going to/from same 
type.

Terry Jan Reedy


From brett at python.org  Wed Jun  9 21:55:06 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 9 Jun 2010 12:55:06 -0700
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <db73e30e840199c0b23b1702da012c2a.squirrel@mail.trueblade.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com> 
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com> 
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com> 
	<1276064788.2227.122.camel@thinko>
	<871vcghbnu.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<huoq0l$dmd$1@dough.gmane.org>
	<db73e30e840199c0b23b1702da012c2a.squirrel@mail.trueblade.com>
Message-ID: <AANLkTikqmAEE9JtE80V4KnwEByIbR4APRuX485lc-Q1f@mail.gmail.com>

On Wed, Jun 9, 2010 at 12:40, Eric Smith <eric at trueblade.com> wrote:
>> On 6/9/2010 4:07 AM, Stephen J. Turnbull wrote:
>> Closed issues are not lost. They can still be searched and the result
>> downloaded.
>>
>>> A keyword would do.  Please don't add a status or something like that,
>>> though.
>>
>> I believe Type: feature request; Version: 2.7; Resolution: wont fix
>> should do fine now. I believe Alexandre will use the first two to find
>> things to close. Anything else anyone finds could be made to match.
>
> Are there any currently existing issues that match those criteria (feature
> request, 2.7, won't fix)?

2.7, closed, wont fix has 27 issues at the moment, which is obviously
small and easy to peruse.

-Brett

>
> I don't have good connectivity here so I can't check.
>
> Eric.

From solipsis at pitrou.net  Wed Jun  9 21:56:59 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Jun 2010 21:56:59 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	...  codecs
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net>
	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
	<AANLkTinthUwZTTNuI7Bfsz-oNKsNBhZ68AuBuSwmtvPq@mail.gmail.com>
	<20100609141748.733d3e94@pitrou.net> <huor1h$i06$1@dough.gmane.org>
Message-ID: <20100609215659.0ea27cde@pitrou.net>

On Wed, 09 Jun 2010 15:45:55 -0400
Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/9/2010 8:17 AM, Antoine Pitrou wrote:
> > On Wed, 9 Jun 2010 13:57:05 +0200
> > Dirkjan Ochtman<dirkjan at ochtman.nl>  wrote:
> >> On Wed, Jun 9, 2010 at 13:40, Antoine Pitrou<solipsis at pitrou.net>  wrote:
> >>> No, I don't think so. If I'm using hex "encoding", it's because I want
> >>> to see a text representation of some arbitrary bytestring (in order to
> >>> display it inside another piece of text, for example).
> >>> In other words, the purpose of hex is precisely to give a textual
> >>> display of non-textual data.
> >>
> >> Or I want to encode binary data in a non-binary-safe protocol, in
> >> which case I probably want bytes.
> >
> > In this case you would probably choose a more space-efficient
> > representation, such as base64 or base85.
> 
> Unless the receiver expects hex.

In which cases is this true? Hex is rarely used for ASCII-encoding of
binary data, precisely because its efficiency is poor.

> Please, hextext = str(somebytes.transform('hex')) is quite easy and 
> explicit and will work for any bytes to ascii-subset transform, not just 
> 'hex'.

It will give you the str representation of a bytes object, which is not
what you want.
Of course, hextext = somebytes.transform('hex').decode('ascii') is not
very hard either. But I disagree with the overall idea that bytes is
the right output type for hex encoding.
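
To make the difference concrete, here is a small sketch. The proposed .transform() method does not exist, so binascii.hexlify() is used as a stand-in for a bytes -> bytes hex transform; that substitution is an assumption of this example, not part of the proposal.

```python
import binascii

somebytes = b"foo"

# Stand-in for the proposed somebytes.transform('hex'): bytes in, bytes out.
hexbytes = binascii.hexlify(somebytes)

# str() on a bytes object yields its repr, including the b'' wrapper.
as_repr = str(hexbytes)
print(as_repr)  # b'666f6f'

# Decoding as ASCII yields the hex digits as text.
as_text = hexbytes.decode("ascii")
print(as_text)  # 666f6f
```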

Regards

Antoine.


From martin at v.loewis.de  Wed Jun  9 22:13:28 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 09 Jun 2010 22:13:28 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <1276083650.3143.1.camel@localhost.localdomain>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>
	<20100609133549.578157ed@pitrou.net>	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
Message-ID: <4C0FF5E8.60305@v.loewis.de>

>> But in both cases you probably want bytes ->  bytes and str ->  str. If
>> you want text out then put text in, if you want bytes out then put bytes in.
>
> No, I don't think so. If I'm using hex "encoding", it's because I want
> to see a text representation of some arbitrary bytestring (in order to
> display it inside another piece of text, for example).
> In other words, the purpose of hex is precisely to give a textual
> display of non-textual data.

I think this is the way it is for consistency reasons (which I would not 
lightly wish away). I think you agree that base64 is a bytes->bytes
transformation (because you typically use it as a payload on some wire
protocol).

So:

py> binascii.b2a_base64(b'foo')
b'Zm9v\n'
py> binascii.b2a_hex(b'foo')
b'666f6f'

Now, I'd admit that "b2a" may be a misnomer (binary -> ASCII), but then
it may not because ASCII actually *also* implies "bytes" (it's an encoding).

So what would you propose to change: b2a_hex should return a Unicode
string? or this future transform method should return a Unicode string,
whereas the module returns bytes? Something else?

Regards,
Martin

From martin at v.loewis.de  Wed Jun  9 22:18:34 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 09 Jun 2010 22:18:34 +0200
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <1276064788.2227.122.camel@thinko>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
Message-ID: <4C0FF71A.1030702@v.loewis.de>

>
> It might be useful to copy the identifiers and URLs of all the backport
> request tickets into some other repository, or to create some unique
> state in roundup for these.  Rationale: it's almost certain that if the
> existing Python core maintainers won't evolve Python 2.X past 2.7, some
> other group will, and losing existing context for that would kinda suck.

Roundup keeps track of all status changes, see the bottom of an 
arbitrary issue for an example.

So I don't think any additional recording is necessary.

Regards,
Martin

From martin at v.loewis.de  Wed Jun  9 22:23:58 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 09 Jun 2010 22:23:58 +0200
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
Message-ID: <4C0FF85E.9080203@v.loewis.de>

Am 09.06.2010 05:58, schrieb Alexandre Vassalotti:
> Is there any plan for a 2.8 release? If not, I will go through the
> tracker and close outstanding backport requests of 3.x features to
> 2.x.

Closing the backport requests is fine. For the feature requests, I'd 
only close them *after* the 2.7 release (after determining that they 
won't apply to 3.x, of course).

There aren't that many backport requests, anyway, are there?

Regards,
Martin

From solipsis at pitrou.net  Wed Jun  9 22:26:25 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 9 Jun 2010 22:26:25 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C0FF5E8.60305@v.loewis.de>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net>
	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
	<4C0FF5E8.60305@v.loewis.de>
Message-ID: <20100609222625.43a216f2@pitrou.net>

On Wed, 09 Jun 2010 22:13:28 +0200
"Martin v. Löwis" <martin at v.loewis.de> wrote:
> py> binascii.b2a_base64(b'foo')
> b'Zm9v\n'
> py> binascii.b2a_hex(b'foo')
> b'666f6f'
> 
> Now, I'd admit that "b2a" may be a misnomer (binary -> ASCII), but then
> it may not because ASCII actually *also* implies "bytes" (it's an encoding).
> 
> So what would you propose to change: b2a_hex should return a Unicode
> string? or this future transform method should return a Unicode string,
> whereas the module returns bytes? Something else?

Well, I would propose transform return str whereas b2a_hex returns
bytes. But I agree the consistency argument with b2a_hex looks quite
strong.
(speaking of which, the builtin hex() function returns str, although
its purpose is slightly different)
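
A short sketch of that contrast, using only the builtin hex() and binascii.b2a_hex(); the value 255 is an arbitrary choice for this example:

```python
import binascii

n = 255

# The builtin formats an *integer* and returns text with a 0x prefix.
builtin_result = hex(n)
print(type(builtin_result).__name__, builtin_result)  # str 0xff

# binascii converts a *byte string* and returns bytes without a prefix.
module_result = binascii.b2a_hex(bytes([n]))
print(type(module_result).__name__, module_result)  # bytes b'ff'
```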

Regards

Antoine.

From steve at holdenweb.com  Thu Jun 10 03:01:00 2010
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 10 Jun 2010 09:01:00 +0800
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <20100609123223.27838ab4@heresy>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>	<huo31u$fbf$1@dough.gmane.org>
	<4C0F91AF.1000401@voidspace.org.uk>	<4C0FA854.1080400@egenix.com>
	<20100609111238.7c017907@heresy>	<19370.1276100000@parc.com>
	<20100609123223.27838ab4@heresy>
Message-ID: <4C10394C.10707@holdenweb.com>

Barry Warsaw wrote:
> On Jun 09, 2010, at 09:13 AM, Bill Janssen wrote:
> 
>> Barry Warsaw <barry at python.org> wrote:
>>
>>> Note that Python 2.7 will be *maintained* for a very long time, which
>>> should satisfy those folks who still require Python 2.  Anybody on
>>> older (and currently unmaintained) versions of Python 2 will not care
>>> about new features so a Python 2.8 wouldn't help them anyway.
>> There are two kinds of new features, though.  Those added to improve (or
>> at any rate modify :-) the product, and those added to keep the product
>> relevant to a changing external world (new operating systems, new
>> communication protocols, etc.)  I think it would take a pretty strong
>> crystal ball to be able to rule out the latter kind of feature add from
>> the 2.x line.
> 
> The latter should mostly be supported by third party packages available in the
> Cheeseshop.  To the extent that such support can't be effected by add-ons
> (e.g. new OS support), I think a better approach would be to encourage and
> allow unofficial ports by utilizing dvcs branches (we *are* moving to
> Mercurial after Python 2.7 final is released, right?).
> 
> I think we should plan on 2.7 being the last Python 2, and spend lots of effort
> to get people onto Python 3, partially by offering big carrots like Unladen
> Swallow, a better/no GIL, etc.  I think it should be part of the PSF's mission
> to help that happen through directed sponsorship, sprints, and other tools.
> 
The current stumbling block isn't the language itself, it's the lack of
support from third-party libraries. GSoC is addressing some of these
issues, but so far we (the PSF, the dev community, anybody else except
R. David Murray) haven't really come to grips with intractable problems
like the broken state of the email package, and we are not doing well at
attracting funds to support it.

So I think we need to address a larger issue than just the language. As
a development community we decided to change the language. Now we have
to do what we can to ensure that the changed language has appropriate
support.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From steve at holdenweb.com  Thu Jun 10 03:01:46 2010
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 10 Jun 2010 09:01:46 +0800
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <huoq0q$dmd$2@dough.gmane.org>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>	<1276064788.2227.122.camel@thinko>	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>	<huo31u$fbf$1@dough.gmane.org>	<4C0F91AF.1000401@voidspace.org.uk>	<4C0FA854.1080400@egenix.com>
	<huoq0q$dmd$2@dough.gmane.org>
Message-ID: <4C10397A.9000504@holdenweb.com>

Terry Reedy wrote:
> On 6/9/2010 10:42 AM, M.-A. Lemburg wrote:
> 
>>> Steve Holden wrote
>>>> How does throwing away information represent "moving forward"?
> 
> 'Closing' a tracker issue does not 'throw away' information; it *adds*
> information as to current intention.
> 
>> It's certainly not fair to require all core developers to
>> continue working on Python2, but it would also be unfair to
>> cancel out that possibility for a subset of interested devs.
> 
> Closing a set of issues does not cancel out that possibility. If such a
> subset of devs develops, they can easily reopen (or move) particular
> issues they are interested in working on.
> 
> 
As long as that's the case I am fine with the change.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From steve at holdenweb.com  Thu Jun 10 03:02:56 2010
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 10 Jun 2010 09:02:56 +0800
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <20100609101224.4425723d@heresy>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<20100609101224.4425723d@heresy>
Message-ID: <hupdk0$bfd$3@dough.gmane.org>

Barry Warsaw wrote:
> On Jun 09, 2010, at 01:15 AM, Fred Drake wrote:
> 
>> On Wed, Jun 9, 2010 at 12:30 AM, Senthil Kumaran <orsenthil at gmail.com> wrote:
>>> it would still be a good idea to
>>> introduce some of them in minor releases in 2.7. I know, this
>>> deviating from the process, but it could be an option considering that
>>> 2.7 is the last of 2.x release.
>> I disagree.
>>
>> If there are going to be features going into *any* post 2.7.0 version,
>> there's no reason not to increment the revision number to 2.8,
>>
>> Since there's also a well-advertised decision that 2.7 will be the
>> last 2.x, such a 2.8 isn't planned.  But there's no reason to violate
>> the no-features-in-bugfix-releases policy.  We've seen violations
>> cause trouble and confusion, but we've not seen it be successful.
>>
>> The policy wasn't arbitrary; let's stick to it.
> 
> I completely agree with Fred.  New features in point releases will cause many
> more headaches than opening up a 2.8, which I still hope we don't do.  I'd
> rather see all that pent up energy focussed on doing whatever we can to help
> people transition to Python 3.
> 
Though one might ironically suggest that sticking to the policy actually
represents a change in policy :)

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From alexandre at peadrop.com  Thu Jun 10 03:10:31 2010
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 9 Jun 2010 18:10:31 -0700
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <4C0FF85E.9080203@v.loewis.de>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com> 
	<4C0FF85E.9080203@v.loewis.de>
Message-ID: <AANLkTilTKIk8fA7HYo1044Al0v24XF-vZ8411Bb0yBsO@mail.gmail.com>

On Wed, Jun 9, 2010 at 1:23 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Closing the backport requests is fine. For the feature requests, I'd only
> close them *after* the 2.7 release (after determining that they won't apply
> to 3.x, of course).
>
> There aren't that many backport requests, anyway, are there?
>

There are only a few requests (about five).

-- Alexandre

From alexandre at peadrop.com  Thu Jun 10 03:13:32 2010
From: alexandre at peadrop.com (Alexandre Vassalotti)
Date: Wed, 9 Jun 2010 18:13:32 -0700
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTil6Qk-7mkzE8Y_YM6MkyAmYtvQHxEAHxJjLPk8N@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com> 
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com> 
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com> 
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com> 
	<AANLkTil6Qk-7mkzE8Y_YM6MkyAmYtvQHxEAHxJjLPk8N@mail.gmail.com>
Message-ID: <AANLkTin2p0ZwGjU5n1X0WeUJ-Af7G4TQ0x5xejyQkZot@mail.gmail.com>

On Wed, Jun 9, 2010 at 5:55 AM, Facundo Batista
<facundobatista at gmail.com> wrote:
> Yes, closing the tickets as "won't fix" and tagging them as
> "will-never-happen-in-2.x" or something, is the best combination of
> both worlds: it will clean the tracker and ease further developments,
> and will allow anybody to pick up those tickets later.
>

The issues I care about are already tagged as 26backport. So, I don't
think another keyword is needed.

-- Alexandre

From orsenthil at gmail.com  Thu Jun 10 08:48:05 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Thu, 10 Jun 2010 12:18:05 +0530
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTilTKIk8fA7HYo1044Al0v24XF-vZ8411Bb0yBsO@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com> 
	<4C0FF85E.9080203@v.loewis.de>
	<AANLkTilTKIk8fA7HYo1044Al0v24XF-vZ8411Bb0yBsO@mail.gmail.com>
Message-ID: <AANLkTinZdrRl1b2RsO3T_5ZSfIBfXhJLBEYmBYpyVofO@mail.gmail.com>

On Thu, Jun 10, 2010 at 6:40 AM, Alexandre Vassalotti
<alexandre at peadrop.com> wrote:
> On Wed, Jun 9, 2010 at 1:23 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> Closing the backport requests is fine. For the feature requests, I'd only
>> close them *after* the 2.7 release (after determining that they won't apply
>> to 3.x, of course).
>>
>> There aren't that many backport requests, anyway, are there?
>>
>
> There are only a few requests (about five)

I get your point. It is the 'backports' that you have tagged. These
were designed for 3.x and implemented in 3.x in the first place.
I was concerned that a policy or practice would emerge of closing
any and every existing feature request against Python 2.7.
There are some cases (in the stdlib) that can be debated along the lines of
feature request vs. bug fix, and those would get hurt in the process.

Thanks,
Senthil

From stephen at xemacs.org  Thu Jun 10 08:59:48 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 10 Jun 2010 15:59:48 +0900
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	...  codecs
In-Reply-To: <20100609215659.0ea27cde@pitrou.net>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F53B9.2020302@egenix.com> <20100609133549.578157ed@pitrou.net>
	<4C0F7D45.4060706@voidspace.org.uk>
	<1276083650.3143.1.camel@localhost.localdomain>
	<AANLkTinthUwZTTNuI7Bfsz-oNKsNBhZ68AuBuSwmtvPq@mail.gmail.com>
	<20100609141748.733d3e94@pitrou.net> <huor1h$i06$1@dough.gmane.org>
	<20100609215659.0ea27cde@pitrou.net>
Message-ID: <19472.36196.673114.398905@uwakimon.sk.tsukuba.ac.jp>

Antoine Pitrou writes:

 > In which cases is this true? Hex is rarely used for ASCII-encoding of
 > binary data, precisely because its efficiency is poor.

MIME quoted-printable, URL-quoting, and XBM come to mind.
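
As a sketch of two of these, using the standard quopri and urllib.parse modules (the sample bytes are made up for this example): both formats write each non-ASCII byte as two hex digits after an escape character. Note that the two functions return different types.

```python
import quopri
from urllib.parse import quote

# Hypothetical sample: "café" as UTF-8, i.e. b'caf\xc3\xa9'.
raw = "café".encode("utf-8")

# MIME quoted-printable: each non-ASCII byte becomes =XX (returns bytes).
qp = quopri.encodestring(raw)
print(qp)  # b'caf=C3=A9'

# URL quoting: each non-ASCII byte becomes %XX (returns str).
url = quote(raw)
print(url)  # caf%C3%A9
```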


From baptiste13z at free.fr  Thu Jun 10 12:27:33 2010
From: baptiste13z at free.fr (Baptiste Carvello)
Date: Thu, 10 Jun 2010 12:27:33 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <201006090153.14190.victor.stinner@haypocalc.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>
Message-ID: <huqemr$ehh$1@dough.gmane.org>

Victor Stinner wrote:
> 
> I suppose that each codec will have a different list of accepted input and 
> output types. Example:
> 
>    bz2: encode:bytes->bytes, decode:bytes->bytes
>    rot13: encode:str->str, decode:str->str
>    hex: encode:bytes->str, decode: str->bytes 

A user point of view: please NO.

This might be more consistent with the semantics, but it forces users to scratch 
their head each time to find out which types are involved. I'd rather all 
methods take and return the same types, independent of codec, that is:

.encode : str->bytes
.decode : bytes->str
.(un)transform : same type, str->str or bytes->bytes

All other uses can be trivially done with .encode('ascii')/.decode('ascii'). 
Changing the type of *ascii* text is easy, understanding bytes vs str semantics 
is not!
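
A small sketch of what that looks like for hex, assuming Python 3 and using
binascii.hexlify/unhexlify as a stand-in for the proposed (and at this point
hypothetical) bytes->bytes .transform('hex') / .untransform('hex'):

```python
import binascii

payload = b"\x00\xffPython"

# Same-type step: bytes -> bytes, which is what a 'hex' transform
# would return under this proposal.
hex_bytes = binascii.hexlify(payload)

# Changing the type of ASCII data is a separate, explicit step.
hex_text = hex_bytes.decode("ascii")

# Going back: str -> bytes first, then the inverse same-type step.
round_trip = binascii.unhexlify(hex_text.encode("ascii"))

print(hex_bytes, hex_text, round_trip)
```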

Cheers,
B.


From walter at livinglogic.de  Thu Jun 10 12:30:01 2010
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Thu, 10 Jun 2010 12:30:01 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <4C0F8D5A.8010706@gmail.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>	<201006091418.44680.victor.stinner@haypocalc.com>
	<4C0F8D5A.8010706@gmail.com>
Message-ID: <4C10BEA9.4090704@livinglogic.de>

On 09.06.10 14:47, Nick Coghlan wrote:

> On 09/06/10 22:18, Victor Stinner wrote:
>> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>>> No, .transform() and .untransform() will be the interface to same-type
>>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>>> .encode()/.decode() these helper methods also implement type safety
>>> of the return type.
>>
>> What about buffer compatible objects like array.array(), memoryview(), etc.?
>> Should we use codecs.encode() / codecs.decode() for these types?
> 
> There are probably enough subtleties that this is all worth specifying 
> in a PEP:
> 
> - which codecs from 2.x are to be restored
> - the domain each codec operates in (binary data or text)*
> - review behaviour of codecs.encode and codecs.decode
> - behaviour of the new str, bytes and bytearray (un)transform methods
> - whether to add helper methods for reverse codecs (like base64)
> 
> The PEP would also serve as a reference back to both this discussion and 
> the previous one (which was long enough ago that I've forgotten most of it).

I too think that a PEP is required here.

Codecs support several types of error handling that don't make sense for
transform()/untransform(). What should 'abc'.decode('hex', 'replace')
do? (In 2.6 it raises an assertion error, because errors *must* be strict).

I think we should take this opportunity to implement
transform/untransform without being burdened with features we inherited
from codecs which don't make sense for transform/untransform.

> [...]

Servus,
   Walter

From mal at egenix.com  Thu Jun 10 13:08:26 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 10 Jun 2010 13:08:26 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <4C10BEA9.4090704@livinglogic.de>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>	<201006091418.44680.victor.stinner@haypocalc.com>	<4C0F8D5A.8010706@gmail.com>
	<4C10BEA9.4090704@livinglogic.de>
Message-ID: <4C10C7AA.9030300@egenix.com>

Walter Dörwald wrote:
> On 09.06.10 14:47, Nick Coghlan wrote:
> 
>> On 09/06/10 22:18, Victor Stinner wrote:
>>> On Wednesday 09 June 2010 10:41:29, M.-A. Lemburg wrote:
>>>> No, .transform() and .untransform() will be the interface to same-type
>>>> codecs, i.e. ones that convert bytes to bytes or str to str. As with
>>>> .encode()/.decode() these helper methods also implement type safety
>>>> of the return type.
>>>
>>> What about buffer compatible objects like array.array(), memoryview(), etc.?
>>> Should we use codecs.encode() / codecs.decode() for these types?
>>
>> There are probably enough subtleties that this is all worth specifying 
>> in a PEP:
>>
>> - which codecs from 2.x are to be restored
>> - the domain each codec operates in (binary data or text)*
>> - review behaviour of codecs.encode and codecs.decode
>> - behaviour of the new str, bytes and bytearray (un)transform methods
>> - whether to add helper methods for reverse codecs (like base64)
>>
>> The PEP would also serve as a reference back to both this discussion and 
>> the previous one (which was long enough ago that I've forgotten most of it).
> 
> I too think that a PEP is required here.

Fair enough. I'll write a PEP.

> Codecs support several types of error handling that don't make sense for
> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
> do? (In 2.6 it raises an assertion error, because errors *must* be strict).

That's not really an issue since codecs don't have to implement
all error handling schemes.

For starters, they will all only implement 'strict' mode.

> I think we should take this opportunity to implement
> transform/untransform without being burdened with features we inherited
> from codecs which don't make sense for transform/untransform.

Not sure what you mean here. Those methods are just helper methods
which interface to the codec system and provide return type safety.
Nothing more or less.

-- 
Marc-Andre Lemburg
eGenix.com


From victor.stinner at haypocalc.com  Thu Jun 10 14:16:46 2010
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 10 Jun 2010 14:16:46 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C10BEA9.4090704@livinglogic.de>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<4C0F8D5A.8010706@gmail.com> <4C10BEA9.4090704@livinglogic.de>
Message-ID: <201006101416.46500.victor.stinner@haypocalc.com>

On Thursday 10 June 2010 12:30:01, Walter Dörwald wrote:
> Codecs support several types of error handling that don't make sense for
> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
> do?

You mean 'abc'.transform('hex', 'replace'), right?

Error handler is useful for encoding codecs (the input type is different than 
the output type), but I don't see how it can be used with hex, rot13, bz2, ... 
(we decided that .transform() and .untransform() will use the same input and 
output types). Even if bz2+xmlcharref can be something funny :-)

.transform() and .untransform() should have only one argument.

(If you would really like to play with the error handler, you can still use 
codecs.encode(name, errors) and codecs.decode(name, errors).)

.transform() and .untransform() have to be simple. If you want to control the 
codec, why not use the real API directly? Examples:
 - base64.b64encode() has an optional altchars argument
 - bz2.compress() has an optional compresslevel argument
 - etc.

I don't see how altchars or compresslevel can be added to .transform() / 
.untransform(). (**kw would be something really ugly.)
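
For illustration, a short sketch of those two module-level APIs, assuming
Python 3 (the sample data is arbitrary). Both take a codec-specific option
that a one-argument .transform(name) would have no place for:

```python
import base64
import bz2

data = b"\xfb\xff" * 3

# base64: altchars replaces the '+' and '/' characters of the alphabet.
std = base64.b64encode(data)
alt = base64.b64encode(data, altchars=b"-_")
print(std, alt)

# bz2: compresslevel trades speed for compression ratio (1..9).
fast = bz2.compress(data * 1000, compresslevel=1)
best = bz2.compress(data * 1000, compresslevel=9)
print(len(fast), len(best))
```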

> (In 2.6 it raises an assertion error, because errors *must* be strict)

hex, bz2, rot13, ... codecs should also raise an error if errors is not 
"strict" (or None which means "strict") in Python3.

-- 
Victor Stinner
http://www.haypocalc.com/

From rdmurray at bitdance.com  Thu Jun 10 14:18:08 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Thu, 10 Jun 2010 08:18:08 -0400
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <huqemr$ehh$1@dough.gmane.org>
References: <201006090153.14190.victor.stinner@haypocalc.com>
	<huqemr$ehh$1@dough.gmane.org>
Message-ID: <20100610121808.7D8901FCB52@kimball.webabinitio.net>

On Thu, 10 Jun 2010 12:27:33 +0200, Baptiste Carvello <baptiste13z at free.fr> wrote:
> Victor Stinner wrote:
> 
> > I suppose that each codec will have a different list of accepted input and
> > output types. Example:
> 
> >    bz2: encode:bytes->bytes, decode:bytes->bytes
> >    rot13: encode:str->str, decode:str->str
> >    hex: encode:bytes->str, decode: str->bytes
> 
> A user point of view: please NO.
> 
> This might be more consistent with the semantics, but it forces users to scratch
> their head each time to find out which types are involved. I'd rather all
> methods take and return the same types, independent of codec, that is:
> 
> .encode : str->bytes
> .decode : bytes->str
> .(un)transform : same type, str->str or bytes->bytes
> 
> All other uses can be trivially done with .encode('ascii')/.decode('ascii').
> Changing the type of *ascii* text is easy, understanding bytes vs str semantics is not!

+1

Consistency in interface is more important in *this* context than the
sensibleness of any particular transform.

--
R. David Murray                                      www.bitdance.com

From barry at python.org  Thu Jun 10 19:51:49 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 10 Jun 2010 13:51:49 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <4C10394C.10707@holdenweb.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<AANLkTindNHNKiifilg8WrL8WzgUZItsStiBd1f6CHBY3@mail.gmail.com>
	<AANLkTil73nqByfrobZdYTA6nRca_uTY4IRAbSqwIq9H4@mail.gmail.com>
	<1276064788.2227.122.camel@thinko>
	<AANLkTinN3AP_P91uHHxfTPRspp_BYvZKhvJ5YkgwZWLC@mail.gmail.com>
	<huo31u$fbf$1@dough.gmane.org> <4C0F91AF.1000401@voidspace.org.uk>
	<4C0FA854.1080400@egenix.com> <20100609111238.7c017907@heresy>
	<19370.1276100000@parc.com> <20100609123223.27838ab4@heresy>
	<4C10394C.10707@holdenweb.com>
Message-ID: <20100610135149.50b3d15a@heresy>

On Jun 10, 2010, at 09:01 AM, Steve Holden wrote:

>The current stumbling block isn't the language itself, it's the lack of
>support from third-party libraries. GSoC is addressing some of these
>issues, but so far we (the PSF, the dev community, anybody else except
>R. David Murray) haven't really come to grips with intractable problems
>like the broken state of the email package, and we are not doing well at
>attracting funds to support it.
>
>So I think we need to address a larger issue than just the language. As
>a development community we decided to change the language. Now we have
>to do what we can to ensure that the changed language has appropriate
>support.

This is exactly my point - I totally agree.  Let's take all that pent up
energy and apply it to porting important libraries to Python 3.

-Barry

From tjreedy at udel.edu  Thu Jun 10 21:25:33 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 10 Jun 2010 15:25:33 -0400
Subject: [Python-Dev] Future of 2.x.
In-Reply-To: <AANLkTinZdrRl1b2RsO3T_5ZSfIBfXhJLBEYmBYpyVofO@mail.gmail.com>
References: <AANLkTikXW_d14y7KIhaR_v80HTN9HjIO4ZIlUkBTBY0F@mail.gmail.com>
	<4C0FF85E.9080203@v.loewis.de>	<AANLkTilTKIk8fA7HYo1044Al0v24XF-vZ8411Bb0yBsO@mail.gmail.com>
	<AANLkTinZdrRl1b2RsO3T_5ZSfIBfXhJLBEYmBYpyVofO@mail.gmail.com>
Message-ID: <hure79$af7$1@dough.gmane.org>

On 6/10/2010 2:48 AM, Senthil Kumaran wrote:
> On Thu, Jun 10, 2010 at 6:40 AM, Alexandre Vassalotti
> <alexandre at peadrop.com>  wrote:
>> On Wed, Jun 9, 2010 at 1:23 PM, "Martin v. Löwis"<martin at v.loewis.de>  wrote:
>>> Closing the backport requests is fine. For the feature requests, I'd only
>>> close them *after* the 2.7 release (after determining that they won't apply
>>> to 3.x, of course).
>>>
>>> There aren't that many backport requests, anyway, are there?
>>>
>>
>> There are only a few requests (about five)
>
> I get your point. It is the 'back-ports' that you have tagged.

Right, things already in 3.x.

> These
> were designed for 3.x and implemented in 3.x in the first place.
> I was concerned that there will be policy drawn or a practice that
> will close any/every existing Feature Request in Python 2.7.
> There are some cases (in stdlib) which can be debated along the lines of
> feature request vs bug-fix and those will get hurt in the process.

I have started going through old open issues tagged with 2.5. Many are 
unclassified. Those that are feature requests that are *plausible* for 
3.2 I am marking as such and retagging for 3.2, *not* closing. (I am 
also marking bug reports as such and asking the OP to test in 2.6/7 and 
maybe 3.1 if I cannot easily do so.)

Ideally, all core/stdlib feature requests should be classified as such 
and tagged for 3.2 (or even 3.3) only.

Terry Jan Reedy


From tjreedy at udel.edu  Thu Jun 10 21:31:58 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 10 Jun 2010 15:31:58 -0400
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
	... codecs
In-Reply-To: <4C10C7AA.9030300@egenix.com>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>	<201006091418.44680.victor.stinner@haypocalc.com>	<4C0F8D5A.8010706@gmail.com>	<4C10BEA9.4090704@livinglogic.de>
	<4C10C7AA.9030300@egenix.com>
Message-ID: <hurejc$buv$1@dough.gmane.org>

On 6/10/2010 7:08 AM, M.-A. Lemburg wrote:
> Walter Dörwald wrote:

>>> The PEP would also serve as a reference back to both this discussion and
>>> the previous one (which was long enough ago that I've forgotten most of it).
>>
>> I too think that a PEP is required here.
>
> Fair enough. I'll write a PEP.

Thank you from me.
>
>> Codecs support several types of error handling that don't make sense for
>> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
>> do? (In 2.6 it raises an assertion error, because errors *must* be strict).

I would expect either ValueError: errors arg must be 'strict' for 
transform or else TypeError: transform takes 1 arg, 2 given.

> That's not really an issue since codecs don't have to implement
> all error handling schemes.
>
> For starters, they will all only implement 'strict' mode.

Terry Jan Reedy


From walter at livinglogic.de  Fri Jun 11 13:34:37 2010
From: walter at livinglogic.de (=?ISO-8859-1?Q?Walter_D=F6rwald?=)
Date: Fri, 11 Jun 2010 13:34:37 +0200
Subject: [Python-Dev] Reintroduce or drop completly hex, bz2, rot13,
 ... codecs
In-Reply-To: <hurejc$buv$1@dough.gmane.org>
References: <201006090153.14190.victor.stinner@haypocalc.com>	<4C0F53B9.2020302@egenix.com>	<201006091418.44680.victor.stinner@haypocalc.com>	<4C0F8D5A.8010706@gmail.com>	<4C10BEA9.4090704@livinglogic.de>	<4C10C7AA.9030300@egenix.com>
	<hurejc$buv$1@dough.gmane.org>
Message-ID: <4C121F4D.5020206@livinglogic.de>

On 10.06.10 21:31, Terry Reedy wrote:

> On 6/10/2010 7:08 AM, M.-A. Lemburg wrote:
>> Walter Dörwald wrote:
> 
>>>> The PEP would also serve as a reference back to both this discussion and
>>>> the previous one (which was long enough ago that I've forgotten most of it).
>>>
>>> I too think that a PEP is required here.
>>
>> Fair enough. I'll write a PEP.
> 
> Thank you from me.
>>
>>> Codecs support several types of error handling that don't make sense for
>>> transform()/untransform(). What should 'abc'.decode('hex', 'replace')
>>> do? (In 2.6 it raises an assertion error, because errors *must* be strict).
> 
> I would expect either ValueError: errors arg must be 'strict' for 
> transform

What use is an argument that must always have the same value?

'abc'.transform('hex', errors='strict', obey_the_flufl=True)

> or else TypeError: transform takes 1 arg, 2 given.

IMHO that's the better option.

>> That's not really an issue since codecs don't have to implement
>> all error handling schemes.
>>
>> For starters, they will all only implement 'strict' mode.

I would prefer it if transformers were separate from codecs and had
their own registry.
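
A rough sketch of what I have in mind, assuming Python 3. The names
register_transformer, transform and untransform are hypothetical (nothing
like them exists in the standard library); the existing binascii functions
and the rot_13 codec are only borrowed to have something to register:

```python
import binascii
import codecs

# Hypothetical registry, separate from the codec registry:
# name -> (forward function, backward function).
_TRANSFORMERS = {}


def register_transformer(name, forward, backward):
    """Register a same-type transformer under the given name."""
    _TRANSFORMERS[name] = (forward, backward)


def _apply(data, name, index):
    try:
        func = _TRANSFORMERS[name][index]
    except KeyError:
        raise LookupError("unknown transformer: %r" % (name,)) from None
    result = func(data)
    # Same-type guarantee: bytes in -> bytes out, str in -> str out.
    if type(result) is not type(data):
        raise TypeError("transformer %r changed the type" % (name,))
    return result


def transform(data, name):
    """Apply the named transformer. There is no 'errors' argument."""
    return _apply(data, name, 0)


def untransform(data, name):
    """Apply the inverse of the named transformer."""
    return _apply(data, name, 1)


register_transformer("hex", binascii.hexlify, binascii.unhexlify)
register_transformer(
    "rot13",
    lambda s: codecs.encode(s, "rot_13"),
    lambda s: codecs.decode(s, "rot_13"),
)

print(transform(b"abc", "hex"))
print(transform("Hello", "rot13"))
```

With this shape, transform(b"abc", "hex", "replace") fails with a TypeError
simply because the function takes no third argument.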

Servus,
   Walter

From status at bugs.python.org  Fri Jun 11 18:07:44 2010
From: status at bugs.python.org (Python tracker)
Date: Fri, 11 Jun 2010 18:07:44 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20100611160744.AF7E57816D@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2010-06-04 - 2010-06-11)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2764 open (+54) / 18028 closed (+23) / 20792 total (+77)

Open issues with patches:  1115

Average duration of open issues: 714 days.
Median duration of open issues: 502 days.

Open Issues Breakdown
       open  2737 (+54)
languishing    12 ( +0)
    pending    14 ( +0)

Issues Created Or Reopened (79)
_______________________________

datetime.datetime operator methods are not subclass-friendly   2010-06-09
       http://bugs.python.org/issue2267    reopened belopolsky                           
       patch                                                                   

DeprecationWarning message applies to wrong context with exec( 2010-06-10
       http://bugs.python.org/issue3423    reopened ghazel                               
                                                                               

sunau bytes / str TypeError in Py3k                            2010-06-04
CLOSED http://bugs.python.org/issue8897    created  tjollans                             
       patch                                                                   

The email package should defer to the codecs module for all al 2010-06-04
       http://bugs.python.org/issue8898    created  r.david.murray                       
       easy                                                                    

Add docstrings to time.struct_time                             2010-06-04
CLOSED http://bugs.python.org/issue8899    created  belopolsky                           
       patch, easy                                                             

IDLE crashes if Preference set to At Startup -> Open Edit Wind 2010-06-04
       http://bugs.python.org/issue8900    created  mhuster                              
                                                                               

Windows registry path not ignored with -E option               2010-06-05
       http://bugs.python.org/issue8901    created  flashk                               
       patch, needs review                                                     

add datetime.time.now() for consistency                        2010-06-05
       http://bugs.python.org/issue8902    created  techtonik                            
                                                                               

Add module level now() and today() functions to datetime modul 2010-06-05
CLOSED http://bugs.python.org/issue8903    created  techtonik                            
                                                                               

quick example how to fix docs                                  2010-06-05
       http://bugs.python.org/issue8904    created  techtonik                            
                                                                               

difflib should accept arbitrary line iterators                 2010-06-05
       http://bugs.python.org/issue8905    created  techtonik                            
                                                                               

Document TestCase attributes in class docstring                2010-06-05
       http://bugs.python.org/issue8906    created  flub                                 
                                                                               

time module documentation differs in trunk and py3k            2010-06-05
CLOSED http://bugs.python.org/issue8907    created  belopolsky                           
       patch                                                                   

friendly errors for UAC misbehavior in windows installers      2010-06-05
       http://bugs.python.org/issue8908    created  techtonik                            
       patch                                                                   

mention bitmap size for bdist_wininst                          2010-06-05
CLOSED http://bugs.python.org/issue8909    created  techtonik                            
       patch                                                                   

Write a text file explaining why Lib/test/data exists          2010-06-06
       http://bugs.python.org/issue8910    created  brett.cannon                         
       patch, easy, needs review                                               

regrtest.main should have a test skipping argument             2010-06-06
       http://bugs.python.org/issue8911    created  brett.cannon                         
       easy                                                                    

`make patchcheck` should check the whitespace of .c/.h files   2010-06-06
       http://bugs.python.org/issue8912    created  brett.cannon                         
                                                                               

Document that datetime.__format__ is datetime.strftime         2010-06-06
       http://bugs.python.org/issue8913    created  brett.cannon                         
       easy                                                                    

Run clang's static analyzer                                    2010-06-06
       http://bugs.python.org/issue8914    created  brett.cannon                         
                                                                               

Use locale.nl_langinfo in _strptime                            2010-06-06
       http://bugs.python.org/issue8915    created  brett.cannon                         
                                                                               

Move PEP 362 (function signature objects) into inspect         2010-06-06
       http://bugs.python.org/issue8916    created  brett.cannon                         
                                                                               

Segmentation error happens in Embedding Python.                2010-06-06
       http://bugs.python.org/issue8917    created  tanaga                               
                                                                               

distutils test failure on solaris: IOError: [Errno 2] No such  2010-06-06
       http://bugs.python.org/issue8918    created  srid                                 
                                                                               

python should read ~/.pythonrc.py by default                   2010-06-06
       http://bugs.python.org/issue8919    created  lesmana                              
                                                                               

PYTHONSTARTUP should expand "~"                                2010-06-06
       http://bugs.python.org/issue8920    created  lesmana                              
                                                                               

2.7rc1: test_ttk failures on OSX 10.4                          2010-06-06
       http://bugs.python.org/issue8921    created  srid                                 
                                                                               

Improve encoding shortcuts in PyUnicode_AsEncodedString()      2010-06-06
CLOSED http://bugs.python.org/issue8922    created  haypo                                
       patch                                                                   

Remove unused "errors" argument from _PyUnicode_AsDefaultEncod 2010-06-06
       http://bugs.python.org/issue8923    created  haypo                                
       patch                                                                   

Error in error message in logging                              2010-06-06
       http://bugs.python.org/issue8924    created  PeterL                               
                                                                               

Improve c-api/arg.rst: use "bytes" or "str" types instead of " 2010-06-06
CLOSED http://bugs.python.org/issue8925    created  haypo                                
       patch                                                                   

getargs.c: release the buffer on error                         2010-06-06
       http://bugs.python.org/issue8926    created  haypo                                
       patch                                                                   

Cannot handle complex requirement resolution                   2010-06-06
       http://bugs.python.org/issue8927    created  dabrahams                            
                                                                               

wininst: could not create key                                  2010-06-06
CLOSED http://bugs.python.org/issue8928    created  techtonik                            
                                                                               

wininst: msvcr90 dependency in x64 build                       2010-06-06
CLOSED http://bugs.python.org/issue8929    created  techtonik                            
                                                                               

messed up formatting after reindenting                         2010-06-06
       http://bugs.python.org/issue8930    created  benjamin.peterson                    
                                                                               

'#' has no affect with 'c' type                                2010-06-06
       http://bugs.python.org/issue8931    created  benjamin.peterson                    
                                                                               

test_capi fails --without-threads                              2010-06-07
CLOSED http://bugs.python.org/issue8932    created  skrah                                
       patch, buildbot                                                         

Invalid detection of metadata version                          2010-06-07
       http://bugs.python.org/issue8933    created  benliles                             
                                                                               

aifc should use str instead of bytes (wave, sunau compatibilit 2010-06-07
       http://bugs.python.org/issue8934    created  tjollans                             
       patch                                                                   

Syntax error in os.py                                          2010-06-08
CLOSED http://bugs.python.org/issue8935    created  sklein                               
                                                                               

webbrowser regression on windows                               2010-06-08
       http://bugs.python.org/issue8936    created  techtonik                            
                                                                               

SimpleHTTPServer should contain usage example                  2010-06-08
       http://bugs.python.org/issue8937    created  techtonik                            
                                                                               

Mac OS  dialogs(Save As..., Load) translation                  2010-06-08
       http://bugs.python.org/issue8938    created  Pavel.Denisow                        
                                                                               

Use C type names (PyUnicode etc;) in the C API docs            2010-06-08
       http://bugs.python.org/issue8939    created  pitrou                               
       patch                                                                   

*HTTPServer need a summary page with API inheritance table     2010-06-08
       http://bugs.python.org/issue8940    created  techtonik                            
                                                                               

utf-32be codec failing on UCS-2 python build for 32-bit value  2010-06-08
       http://bugs.python.org/issue8941    created  opstad                               
       patch                                                                   

__path__ attribute of modules loaded by zipimporter is unteste 2010-06-08
       http://bugs.python.org/issue8942    created  exarkun                              
                                                                               

Bug in InteractiveConsole                                      2010-06-08
       http://bugs.python.org/issue8943    created  fabioz                               
                                                                               

test_winreg.test_reflection_functions fails on Windows Server  2010-06-08
       http://bugs.python.org/issue8944    created  brian.curtin                         
                                                                               

Bug in **kwds expansion on call?                               2010-06-08
CLOSED http://bugs.python.org/issue8945    created  tjreedy                              
                                                                               

PyBuffer_Release signature in 3.1 documentation is incorrect   2010-06-08
CLOSED http://bugs.python.org/issue8946    created  opstad                               
                                                                               

Provide  as_integer_ratio() method to Decimal                  2010-06-08
       http://bugs.python.org/issue8947    created  belopolsky                           
       patch                                                                   

cleanup functions are not executed with unittest.TestCase.debu 2010-06-08
CLOSED http://bugs.python.org/issue8948    created  michael.foord                        
                                                                               

PyArg_Parse*(): "z" should not accept bytes                    2010-06-08
       http://bugs.python.org/issue8949    created  haypo                                
       patch                                                                   

In getargs.c, make 'L' code raise TypeError for float argument 2010-06-08
CLOSED http://bugs.python.org/issue8950    created  mark.dickinson                       
       patch                                                                   

PyArg_Parse*(): factorize code of 's' and 'z' formats, and 'u' 2010-06-08
       http://bugs.python.org/issue8951    created  haypo                                
       patch                                                                   

Doc/c-api/arg.rst: fix documentation of number formats         2010-06-09
       http://bugs.python.org/issue8952    created  haypo                                
                                                                               

Syntax error in http://docs.python.org/library/decimal.html#re 2010-06-09
CLOSED http://bugs.python.org/issue8953    created  Jean.Jordaan                         
                                                                               

wininst regression: errors when building on linux              2010-06-09
       http://bugs.python.org/issue8954    created  techtonik                            
                                                                               

import doesn't notice changes to working directory             2010-06-09
CLOSED http://bugs.python.org/issue8955    created  purpleidea                           
                                                                               

Incorrect ValueError message for subprocess.Popen.send_signal( 2010-06-09
       http://bugs.python.org/issue8956    created  giampaolo.rodola                     
                                                                               

strptime('%c', ..) fails to parse output of strftime('%c', ..) 2010-06-09
       http://bugs.python.org/issue8957    created  belopolsky                           
                                                                               

2.7rc1 tarfile.py: `bltn_open(targetpath, "wb")` -> IOError: I 2010-06-09
CLOSED http://bugs.python.org/issue8958    created  srid                                 
                                                                               

WINFUNCTYPE wrapped ctypes callbacks not functioning correctly 2010-06-10
       http://bugs.python.org/issue8959    created  mdcurran                             
                                                                               

2.6 README                                                     2010-06-10
       http://bugs.python.org/issue8960    created  vojta.rylko                          
                                                                               

compile Python-2.7rc1 on AIX 5.3 with xlc_r                    2010-06-10
CLOSED http://bugs.python.org/issue8961    created  tgulacsi                             
                                                                               

IOError: [Errno 13] permission denied                          2010-06-10
CLOSED http://bugs.python.org/issue8962    created  Caitlin.Kavanaugh                    
                                                                               

test_urllibnet failure                                         2010-06-10
       http://bugs.python.org/issue8963    created  pitrou                               
       patch                                                                   

Method _sys_version() module Lib\platform.py does parse correc 2010-06-10
       http://bugs.python.org/issue8964    created  fredericaltorres                     
                                                                               

test_imp fails on OSX when LANG is set                         2010-06-10
CLOSED http://bugs.python.org/issue8965    created  belopolsky                           
       patch                                                                   

ctypes: remove implicit conversion between unicode and bytes   2010-06-10
       http://bugs.python.org/issue8966    created  haypo                                
       patch                                                                   

Create PyErr_GetWindowsMessage() function                      2010-06-11
       http://bugs.python.org/issue8967    created  haypo                                
       patch                                                                   

token type constants are not documented                        2010-06-11
       http://bugs.python.org/issue8968    created  isandler                             
                                                                               

Windows: use (mbcs in) strict mode to encode/decode filenames, 2010-06-11
       http://bugs.python.org/issue8969    created  haypo                                
       patch                                                                   

Tkinter Litmus Test                                            2010-06-11
CLOSED http://bugs.python.org/issue8970    created  rantingrick                          
                                                                               

Tkinter Litmus Test                                            2010-06-11
CLOSED http://bugs.python.org/issue8971    created  rantingrick                          
                                                                               

subprocess.list2cmdline doesn't quote the & character          2010-06-11
       http://bugs.python.org/issue8972    created  shypike                              
                                                                               

Inconsistent docstrings in struct module                       2010-06-11
       http://bugs.python.org/issue8973    created  belopolsky                           
                                                                               

Issues Now Closed (46)
______________________

Confusing error message when dividing timedelta using /        1008 days
       http://bugs.python.org/issue1083    belopolsky                           
       patch                                                                   

"[Errno 11] Resource temporarily unavailable" while using trac  676 days
       http://bugs.python.org/issue3494    tjreedy                              
                                                                               

email.generator.Generator object bytes/str crash - b64encode()  522 days
       http://bugs.python.org/issue4768    r.david.murray                       
       patch                                                                   

msgfmt.py does not work with plural form                        452 days
       http://bugs.python.org/issue5464    loewis                               
                                                                               

tools\msi\merge.py is sensitive to lack of config.py            451 days
       http://bugs.python.org/issue5467    loewis                               
       patch                                                                   

CVE-2008-5983 python: untrusted python modules search path      423 days
       http://bugs.python.org/issue5753    akuchling                            
       patch                                                                   

httplib fails with HEAD requests to pages with "transfer-encod    9 days
       http://bugs.python.org/issue6312    orsenthil                            
       patch                                                                   

Tkinter import fails when running Python.exe from a network sh  327 days
       http://bugs.python.org/issue6470    loewis                               
       patch, needs review                                                     

IDLE (python 3.1.1) syntax coloring for b'bytestring' and u'un  230 days
       http://bugs.python.org/issue7166    taleinat                             
       easy                                                                    

raw_input should encode unicode prompt with std.stdout.encodin  136 days
       http://bugs.python.org/issue7768    naoki                                
                                                                               

[patch] convenience links for subprocess.call()                  82 days
       http://bugs.python.org/issue8151    georg.brandl                         
       patch                                                                   

Unified hash for numeric types.                                  83 days
       http://bugs.python.org/issue8188    mark.dickinson                       
       patch                                                                   

SkipTest exception in setUpClass or setUpModule is marked as a   63 days
       http://bugs.python.org/issue8302    michael.foord                        
                                                                               

Suppress large diffs in unitttest.TestCase.assertSequenceEqual   58 days
       http://bugs.python.org/issue8351    michael.foord                        
       patch                                                                   

automate minidom.unlink() with a context manager                 13 days
       http://bugs.python.org/issue8832    merwok                               
       patch, patch, easy, needs review                                        

PyArg_ParseTuple(): remove "t# format                            13 days
       http://bugs.python.org/issue8839    lemburg                              
       patch                                                                   

Deprecate or remove "U" and "U#" formats of Py_BuildValue()      10 days
       http://bugs.python.org/issue8848    haypo                                
       patch                                                                   

multiprocessing: undefined struct/union member: msg_control       4 days
       http://bugs.python.org/issue8864    loewis                               
                                                                               

--user-access-control=force produces invalid installer on Vist    9 days
       http://bugs.python.org/issue8870    r.david.murray                       
                                                                               

XML-RPC improvement is described twice.                           5 days
       http://bugs.python.org/issue8875    akuchling                            
                                                                               

newline vs. newlines in io module                                 0 days
       http://bugs.python.org/issue8895    r.david.murray                       
                                                                               

sunau bytes / str TypeError in Py3k                               3 days
       http://bugs.python.org/issue8897    haypo                                
       patch                                                                   

Add docstrings to time.struct_time                                1 days
       http://bugs.python.org/issue8899    belopolsky                           
       patch, easy                                                             

Add module level now() and today() functions to datetime modul    6 days
       http://bugs.python.org/issue8903    rhettinger                           
                                                                               

time module documentation differs in trunk and py3k               3 days
       http://bugs.python.org/issue8907    belopolsky                           
       patch                                                                   

mention bitmap size for bdist_wininst                             1 days
       http://bugs.python.org/issue8909    tarek                                
       patch                                                                   

Improve encoding shortcuts in PyUnicode_AsEncodedString()         4 days
       http://bugs.python.org/issue8922    haypo                                
       patch                                                                   

Improve c-api/arg.rst: use "bytes" or "str" types instead of "    1 days
       http://bugs.python.org/issue8925    haypo                                
       patch                                                                   

wininst: could not create key                                     0 days
       http://bugs.python.org/issue8928    tarek                                
                                                                               

wininst: msvcr90 dependency in x64 build                          0 days
       http://bugs.python.org/issue8929    loewis                               
                                                                               

test_capi fails --without-threads                                 2 days
       http://bugs.python.org/issue8932    skrah                                
       patch, buildbot                                                         

Syntax error in os.py                                             0 days
       http://bugs.python.org/issue8935    ezio.melotti                         
                                                                               

Bug in **kwds expansion on call?                                  0 days
       http://bugs.python.org/issue8945    rhettinger                           
                                                                               

PyBuffer_Release signature in 3.1 documentation is incorrect      0 days
       http://bugs.python.org/issue8946    brian.curtin                         
                                                                               

cleanup functions are not executed with unittest.TestCase.debu    2 days
       http://bugs.python.org/issue8948    michael.foord                        
                                                                               

In getargs.c, make 'L' code raise TypeError for float argument    2 days
       http://bugs.python.org/issue8950    mark.dickinson                       
       patch                                                                   

Syntax error in http://docs.python.org/library/decimal.html#re    0 days
       http://bugs.python.org/issue8953    brian.curtin                         
                                                                               

import doesn't notice changes to working directory                0 days
       http://bugs.python.org/issue8955    r.david.murray                       
                                                                               

2.7rc1 tarfile.py: `bltn_open(targetpath, "wb")` -> IOError: I    2 days
       http://bugs.python.org/issue8958    lars.gustaebel                       
                                                                               

compile Python-2.7rc1 on AIX 5.3 with xlc_r                       1 days
       http://bugs.python.org/issue8961    tgulacsi                             
                                                                               

IOError: [Errno 13] permission denied                             0 days
       http://bugs.python.org/issue8962    mark.dickinson                       
                                                                               

test_imp fails on OSX when LANG is set                            0 days
       http://bugs.python.org/issue8965    belopolsky                           
       patch                                                                   

Tkinter Litmus Test                                               0 days
       http://bugs.python.org/issue8970    merwok                               
                                                                               

Tkinter Litmus Test                                               0 days
       http://bugs.python.org/issue8971    r.david.murray                       
                                                                               

Installing w/o admin generates key error                       2840 days
       http://bugs.python.org/issue600952  tarek                                
                                                                               

xmlrpclib can no longer marshal Fault objects                  1086 days
       http://bugs.python.org/issue1739842 tjreedy                              
                                                                               

Top Issues Most Discussed (10)
______________________________

 18 test_urllibnet failure                                             1 days
open        http://bugs.python.org/issue8963   

 18 Add pure Python implementation of datetime module to CPython     109 days
open        http://bugs.python.org/issue7989   

 18 datetime lacks concrete tzinfo impl. for UTC                     499 days
open        http://bugs.python.org/issue5094   

 11 crash appending list and namedtuple                               14 days
open        http://bugs.python.org/issue8847   

 11 tarfile/Windows: Don't use mbcs as the default encoding           21 days
open        http://bugs.python.org/issue8784   

  9 test_imp fails on OSX when LANG is set                             0 days
closed      http://bugs.python.org/issue8965   

  8 Improve c-api/arg.rst: use "bytes" or "str" types instead of "s    1 days
closed      http://bugs.python.org/issue8925   

  8 Popen should raise ValueError if pass a string when shell=False  129 days
open        http://bugs.python.org/issue7839   

  7 Use C type names (PyUnicode etc;) in the C API docs                3 days
open        http://bugs.python.org/issue8939   

  7 Improve encoding shortcuts in PyUnicode_AsEncodedString()          4 days
closed      http://bugs.python.org/issue8922   


From brett at python.org  Sat Jun 12 02:35:22 2010
From: brett at python.org (Brett Cannon)
Date: Fri, 11 Jun 2010 17:35:22 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume self
	when used as a method?
Message-ID: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>

The logging module taught me something today about the difference
between a function defined in C and a function defined in Python::

  import importlib

  class Base:
    def imp(self, name):
        return self.import_(name)

  class CVersion(Base):
    import_ = __import__

  class PyVersion(Base):
    import_ = importlib.__import__

  CVersion().imp('tokenize')
  PyVersion().imp('tokenize')  # Fails!


Turns out the use of __import__ works while the importlib version
fails. Why does importlib fail? Because the first argument to the
importlib.__import__ function is an instance of PyVersion, not a
string. And yet the __import__ version works as if the self argument
is never passed to it!

This "magical" ignoring of self seems to extend to any PyCFunction. Is
this dichotomy intentional or just a "fluke"? Maybe this is a
hold-over from before we had descriptors and staticmethod, but now
that we have these things perhaps this difference should go away.

From benjamin at python.org  Sat Jun 12 02:41:52 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 11 Jun 2010 19:41:52 -0500
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>
Message-ID: <AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>

2010/6/11 Brett Cannon <brett at python.org>:
> This "magical" ignoring of self seems to extend to any PyCFunction. Is
> this dichotomy intentional or just a "fluke"? Maybe this is a
> hold-over from before we had descriptors and staticmethod, but now
> that we have these things perhaps this difference should go away.

There are several open feature requests about this. It is merely
because PyCFunction does not implement __get__.
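
A minimal sketch of what "does not implement __get__" means in practice,
assuming CPython 3.x. The names py_func and Demo are made up for this
illustration and do not come from the thread:

```python
import types


def py_func(*args):
    """Ordinary Python function; returns whatever it was called with."""
    return args


class Demo:
    c_attr = len        # builtin implemented in C (a PyCFunction)
    py_attr = py_func   # function implemented in Python


# Python functions are descriptors; CPython builtins are not.
builtin_has_get = hasattr(len, "__get__")
function_has_get = hasattr(py_func, "__get__")

d = Demo()
# Instance lookup hands back the builtin untouched...
same_object = d.c_attr is len
# ...but wraps the Python function in a bound method carrying the instance.
is_bound = isinstance(d.py_attr, types.MethodType)

print(builtin_has_get, function_has_get, same_object, is_bound)
```

On CPython this should print "False True True True". Other
implementations may differ, which is the incompatibility discussed
later in the thread.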


-- 
Regards,
Benjamin

From guido at python.org  Sat Jun 12 03:30:36 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 11 Jun 2010 18:30:36 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com> 
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>
Message-ID: <AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>

On Fri, Jun 11, 2010 at 5:41 PM, Benjamin Peterson <benjamin at python.org> wrote:
> 2010/6/11 Brett Cannon <brett at python.org>:
>> This "magical" ignoring of self seems to extend to any PyCFunction. Is
>> this dichotomy intentional or just a "fluke"? Maybe this is a
>> hold-over from before we had descriptors and staticmethod, but now
>> that we have these things perhaps this difference should go away.
>
> There are several open feature requests about this. It is merely
> because PyCFunction does not implement __get__.

Yeah, but this of course is because before descriptors only Python
functions were special-cased as methods, and there was known code that
depended on this. I'm sure there's even more code that depends on this
today (because there is just more code, period :-).

Maybe we could offer a decorator that adds a __get__ to a PyCFunction though.

-- 
--Guido van Rossum (python.org/~guido)

From brett at python.org  Sat Jun 12 03:57:05 2010
From: brett at python.org (Brett Cannon)
Date: Fri, 11 Jun 2010 18:57:05 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com> 
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com> 
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>
Message-ID: <AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>

On Fri, Jun 11, 2010 at 18:30, Guido van Rossum <guido at python.org> wrote:
> On Fri, Jun 11, 2010 at 5:41 PM, Benjamin Peterson <benjamin at python.org> wrote:
>> 2010/6/11 Brett Cannon <brett at python.org>:
>>> This "magical" ignoring of self seems to extend to any PyCFunction. Is
>>> this dichotomy intentional or just a "fluke"? Maybe this is a
>>> hold-over from before we had descriptors and staticmethod, but now
>>> that we have these things perhaps this difference should go away.
>>
>> There are several open feature requests about this. It is merely
>> because PyCFunction does not implement __get__.
>
> Yeah, but this of course is because before descriptors only Python
> functions were special-cased as methods, and there was known code that
> depended on this. I'm sure there's even more code that depends on this
> today (because there is just more code, period :-).
>
> Maybe we could offer a decorator that adds a __get__ to a PyCFunction though.

Well, staticmethod seems to work just as well.
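
A small sketch of the staticmethod workaround, assuming a pure-Python
function stands in for one that used to be written in C. The names
count_items, Unwrapped and Wrapped are hypothetical:

```python
def count_items(obj):
    """Stand-in for a function reimplemented in pure Python."""
    return len(obj)


class Unwrapped:
    size = count_items                 # becomes a bound method on instances


class Wrapped:
    size = staticmethod(count_items)   # never receives the instance


# With staticmethod, the call behaves like the C-function case.
wrapped_result = Wrapped().size("abc")

# Without it, the instance is passed as an extra first argument.
try:
    Unwrapped().size("abc")
except TypeError:
    unwrapped_failed = True
else:
    unwrapped_failed = False

print(wrapped_result, unwrapped_failed)
```

Wrapping in staticmethod gives the same behaviour whether the
underlying callable is written in C or in Python.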

I'm going to make this my first request for what to change in Py4K. =)

-Brett

From pjenvey at underboss.org  Sat Jun 12 19:19:07 2010
From: pjenvey at underboss.org (Philip Jenvey)
Date: Sat, 12 Jun 2010 10:19:07 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>
Message-ID: <F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>


On Jun 11, 2010, at 6:57 PM, Brett Cannon wrote:

> On Fri, Jun 11, 2010 at 18:30, Guido van Rossum <guido at python.org> wrote:
>> On Fri, Jun 11, 2010 at 5:41 PM, Benjamin Peterson <benjamin at python.org> wrote:
>>> 2010/6/11 Brett Cannon <brett at python.org>:
>>>> This "magical" ignoring of self seems to extend to any PyCFunction. Is
>>>> this dichotomy intentional or just a "fluke"? Maybe this is a
>>>> hold-over from before we had descriptors and staticmethod, but now
>>>> that we have these things perhaps this difference should go away.
>>> 
>>> There are several open feature requests about this. It is merely
>>> because PyCFunction does not implement __get__.
>> 
>> Yeah, but this of course is because before descriptors only Python
>> functions were special-cased as methods, and there was known code that
>> depended on this. I'm sure there's even more code that depends on this
>> today (because there is just more code, period :-).
>> 
>> Maybe we could offer a decorator that adds a __get__ to a PyCFunction though.
> 
> Well, staticmethod seems to work just as well.
> 
> I'm going to make this my first request for what to change in Py4K. =)

+1 on changing this, it's annoying for alternate implementations. They oftentimes implement functions in pure Python whereas user code might be expecting the PyCFunction behavior.

Jython's had a couple cases of this incompatibility reported. It's a rare occurrence but it's very mysterious to the user when it happens.

--
Philip Jenvey

From guido at python.org  Sat Jun 12 21:59:54 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 Jun 2010 12:59:54 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com> 
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com> 
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com> 
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com> 
	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
Message-ID: <AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>

On Sat, Jun 12, 2010 at 10:19 AM, Philip Jenvey <pjenvey at underboss.org> wrote:
> +1 on changing this, it's annoying for alternate implementations. They oftentimes implement functions in pure Python whereas user code might be expecting the PYCFunction behavior.
>
> Jython's had a couple cases of this incompatibility reported. It's a rare occurrence but it's very mysterious to the user when it happens.

Well, yeah, but you're presenting an argument *against* changing this
-- existing code will break if it is changed.

I can think of only one way out without just breaking such code: Start
issuing warnings when a bare PyCFunction exists at the class level,
and introduce/recommend decorators that can be used to disambiguate
the two possible intended meanings.

As Brett says, f = staticmethod(func) will work to insist on the
existing PyCFunction semantics. We should also introduce a new
decorator that treats any callable the same way as pure-Python
functions work today: bind the instance to the first argument when it
is called on an instance. I can't think of a good name for that one
right now, but we'll think of one.
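
One way such a decorator could look, written as a pure-Python
descriptor. This is only a sketch: the name bind_instance is invented
here, and it is not an existing builtin or the API proposed in this
thread:

```python
import types


class bind_instance:
    """Hypothetical descriptor: bind any callable like a Python function."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is None:
            # Looked up on the class: return the raw callable.
            return self.func
        # Looked up on an instance: bind it as the first argument.
        return types.MethodType(self.func, instance)


class Example:
    ident = bind_instance(id)   # id() is implemented in C


obj = Example()
bound_result = obj.ident()      # equivalent to id(obj)
print(bound_result == id(obj))
```

types.MethodType accepts any callable, so the same wrapper works for
builtins and for Python functions alike.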

I wish the warning could happen at class definition time, but I expect
that there are use cases where the warning is unnecessary (because the
code happens to be structured so as to never call it through the
instance) or even wrong (who knows what introspection might be
thwarted by wrapping something in staticmethod). Perhaps the warning
can be done by adding a __get__ method to PyCFunction that issues the
warning and then returns the original value.
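
The warning idea can be emulated in pure Python to see how it would
behave. The sketch below assumes a wrapper class rather than a change
to PyCFunction itself; warn_on_instance_access, Legacy and the message
text are all hypothetical:

```python
import warnings


class warn_on_instance_access:
    """Hypothetical wrapper emulating a warning-only __get__."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner=None):
        if instance is not None:
            warnings.warn(
                "C function accessed through an instance; wrap it in "
                "staticmethod() or a binding decorator to disambiguate",
                DeprecationWarning,
                stacklevel=2,
            )
        # Return the original value, so behaviour is unchanged.
        return self.func


class Legacy:
    length = warn_on_instance_access(len)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = Legacy().length("abcd")   # still calls len("abcd")

print(result, len(caught))
```

Because the warning fires on attribute access rather than at class
definition time, code that never goes through an instance stays silent.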

I'm not sure how we'll ever get rid of the warning except in Py4k.

-- 
--Guido van Rossum (python.org/~guido)

From fuzzyman at voidspace.org.uk  Sat Jun 12 23:17:33 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 12 Jun 2010 22:17:33 +0100
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>
	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>
Message-ID: <9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>


On 12 Jun 2010, at 20:59, Guido van Rossum <guido at python.org> wrote:

> On Sat, Jun 12, 2010 at 10:19 AM, Philip Jenvey  
> <pjenvey at underboss.org> wrote:
>> +1 on changing this, it's annoying for alternate implementations.  
>> They oftentimes implement functions in pure Python whereas user  
>> code might be expecting the PYCFunction behavior.
>>
>> Jython's had a couple cases of this incompatibility reported. It's  
>> a rare occurrence but it's very mysterious to the user when it  
>> happens.
>
> Well, yeah, but you're presenting an argument *against* changing this
> -- existing code will break if it is changed.
>
> I can think of only way out without just breaking such code: Start
> issuing warnings when a bare PyCFunction exists at the class level,
> and introduce/recommend decorators that can be used to disambiguate
> the two possible intended meanings.
>
> As Brett says, f = staticmethod(func) will work to insist on the
> existing PyCFunction semantics. We should also introduce a new one
> decorator that treats any callable the same way as pure-Python
> functions work today: bind the instance to the first argument when it
> is called on an instance. I can't think of a good name for that one
> right now, but we'll think of one.
>

method or instancemethod perhaps?

Michael


> I wish the warning could happen at class definition time, but I expect
> that there are use cases where the warning is unnecessary (because the
> code happens to be structured so as to never call it through the
> instance) or even wrong (who knows what introspection might be
> thwarted by wrapping something in staticmethod). Perhaps the warning
> can be done by adding a __get__ method to PyCFunction that issues the
> warning and then returns the original value.
>
> I'm not sure how we'll ever get rid of the warning except in Py4k.
>
> -- 
> --Guido van Rossum (python.org/~guido)

From lists at cheimes.de  Sun Jun 13 01:03:44 2010
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 13 Jun 2010 01:03:44 +0200
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
 self when used as a method?
In-Reply-To: <9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>
	<9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>
Message-ID: <hv13og$u3a$1@dough.gmane.org>

> method or instancemethod perhaps?

The necessary code is already in Python 3.0's code base. I added it
in r56469 as requested in my issue http://bugs.python.org/issue1587. It
seems we had this very discussion over two and a half years ago.

Index: Python/bltinmodule.c
===================================================================
--- Python/bltinmodule.c        (Revision 81963)
+++ Python/bltinmodule.c        (Arbeitskopie)
@@ -2351,6 +2351,7 @@
     SETBUILTIN("frozenset",             &PyFrozenSet_Type);
     SETBUILTIN("property",              &PyProperty_Type);
     SETBUILTIN("int",                   &PyLong_Type);
+    SETBUILTIN("instancemethod",        &PyInstanceMethod_Type);
     SETBUILTIN("list",                  &PyList_Type);
     SETBUILTIN("map",                   &PyMap_Type);
     SETBUILTIN("object",                &PyBaseObject_Type);


>>> class Example:
...     iid = instancemethod(id)
...     id = id
...
>>> Example().id()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: id() takes exactly one argument (0 given)
>>> Example().iid()
139941157882144

Christian


From guido at python.org  Sun Jun 13 01:15:08 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 Jun 2010 16:15:08 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <hv13og$u3a$1@dough.gmane.org>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com> 
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com> 
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com> 
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com> 
	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com> 
	<9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>
	<hv13og$u3a$1@dough.gmane.org>
Message-ID: <AANLkTilXN5YkwV2kADP-OmJuIM9LCegHpqt85BSB69sq@mail.gmail.com>

Hey! No borrowing the time machine! :-)

On Sat, Jun 12, 2010 at 4:03 PM, Christian Heimes <lists at cheimes.de> wrote:
>> method or instancemethod perhaps?
>
> The necessary code is already in Python 3.0's code base. I've added it
> in r56469 as requested in my issue http://bugs.python.org/issue1587. It
> seems we had this very discussion over two and a half years ago.
>
> Index: Python/bltinmodule.c
> ===================================================================
> --- Python/bltinmodule.c        (Revision 81963)
> +++ Python/bltinmodule.c        (working copy)
> @@ -2351,6 +2351,7 @@
>      SETBUILTIN("frozenset",             &PyFrozenSet_Type);
>      SETBUILTIN("property",              &PyProperty_Type);
>      SETBUILTIN("int",                   &PyLong_Type);
> +    SETBUILTIN("instancemethod",        &PyInstanceMethod_Type);
>      SETBUILTIN("list",                  &PyList_Type);
>      SETBUILTIN("map",                   &PyMap_Type);
>      SETBUILTIN("object",                &PyBaseObject_Type);
>
>
> >>> class Example:
> ...     iid = instancemethod(id)
> ...     id = id
> ...
> >>> Example().id()
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: id() takes exactly one argument (0 given)
> >>> Example().iid()
> 139941157882144
>
> Christian


-- 
--Guido van Rossum (python.org/~guido)

From guido at python.org  Sun Jun 13 01:16:17 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 12 Jun 2010 16:16:17 -0700
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
	self when used as a method?
In-Reply-To: <AANLkTilXN5YkwV2kADP-OmJuIM9LCegHpqt85BSB69sq@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com> 
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com> 
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com> 
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com> 
	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com> 
	<9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>
	<hv13og$u3a$1@dough.gmane.org> 
	<AANLkTilXN5YkwV2kADP-OmJuIM9LCegHpqt85BSB69sq@mail.gmail.com>
Message-ID: <AANLkTill7w6D11sbXVR4diQmO5Z6bqTZqluE-4voGae5@mail.gmail.com>

(Of course, I'd still like to see the warning, since it's now a
portability issue.)

On Sat, Jun 12, 2010 at 4:15 PM, Guido van Rossum <guido at python.org> wrote:
> Hey! No borrowing the time machine! :-)
>
> On Sat, Jun 12, 2010 at 4:03 PM, Christian Heimes <lists at cheimes.de> wrote:
>>> method or instancemethod perhaps?
>>
>> The necessary code is already in Python 3.0's code base. I've added it
>> in r56469 as requested in my issue http://bugs.python.org/issue1587. It
>> seems we had this very discussion over two and a half years ago.
>>
>> Index: Python/bltinmodule.c
>> ===================================================================
>> --- Python/bltinmodule.c        (Revision 81963)
>> +++ Python/bltinmodule.c        (working copy)
>> @@ -2351,6 +2351,7 @@
>>      SETBUILTIN("frozenset",             &PyFrozenSet_Type);
>>      SETBUILTIN("property",              &PyProperty_Type);
>>      SETBUILTIN("int",                   &PyLong_Type);
>> +    SETBUILTIN("instancemethod",        &PyInstanceMethod_Type);
>>      SETBUILTIN("list",                  &PyList_Type);
>>      SETBUILTIN("map",                   &PyMap_Type);
>>      SETBUILTIN("object",                &PyBaseObject_Type);
>>
>>
>> >>> class Example:
>> ...     iid = instancemethod(id)
>> ...     id = id
>> ...
>> >>> Example().id()
>> Traceback (most recent call last):
>>   File "<stdin>", line 1, in <module>
>> TypeError: id() takes exactly one argument (0 given)
>> >>> Example().iid()
>> 139941157882144
>>
>> Christian
>
>
>
> --
> --Guido van Rossum (python.org/~guido)
>


-- 
--Guido van Rossum (python.org/~guido)

From lists at cheimes.de  Sun Jun 13 01:22:11 2010
From: lists at cheimes.de (Christian Heimes)
Date: Sun, 13 Jun 2010 01:22:11 +0200
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
 self when used as a method?
In-Reply-To: <AANLkTilXN5YkwV2kADP-OmJuIM9LCegHpqt85BSB69sq@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>
	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>
	<9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>
	<hv13og$u3a$1@dough.gmane.org>
	<AANLkTilXN5YkwV2kADP-OmJuIM9LCegHpqt85BSB69sq@mail.gmail.com>
Message-ID: <4C1416A3.60004@cheimes.de>

On 13.06.2010 01:15, Guido van Rossum wrote:
> Hey! No borrowing the time machine! :-)

Too late, Guido. The keys to the time machine are back in their usual
place. You should hide them better next time. ;)

Christian

From greg.ewing at canterbury.ac.nz  Sun Jun 13 02:30:16 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 13 Jun 2010 12:30:16 +1200
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly
 consume	self when used as a method?
In-Reply-To: <AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>
	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>
	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>
	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>
	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>
	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>
Message-ID: <4C142698.6090209@canterbury.ac.nz>

Guido van Rossum wrote:
> bind the instance to the first argument when it
> is called on an instance. I can't think of a good name for that one
> right now, but we'll think of one.

dynamicmethod?

-- 
Greg

From python-dev at code2develop.com  Mon Jun 14 12:29:08 2010
From: python-dev at code2develop.com (F van der Meeren)
Date: Mon, 14 Jun 2010 12:29:08 +0200
Subject: [Python-Dev] Static linking with libpython.a
Message-ID: <F492A530-25D1-4A96-9E36-855362F61C26@code2develop.com>

Hello,

I am trying to figure out which files to ship with my app so that it can initialize the Python runtime.
Where can I find information about this?

I am currently targeting Mac OS X 10.5 and above.

Thank you,

Filip

From kristjan at ccpgames.com  Mon Jun 14 14:08:02 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Mon, 14 Jun 2010 12:08:02 +0000
Subject: [Python-Dev] debug and release python
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>

Hello there.
I'm sure this has come up before, but here it is again:

Python exports a different API in debug mode, depending on whether PYMALLOC_DEBUG and WITH_PYMALLOC are defined.  This means that any _d.pyd files that are used must have been compiled against a version of Python built with the same settings for these macros.  It is unfortunate that the _PyObject_DebugMalloc() API is exposed to external applications through macros in objimpl.h.

I would suggest two things:

1)      Provide dummy or thunking versions of those functions in builds that don't have PYMALLOC_DEBUG implemented, which thunk to PyObject_Malloc et al. (This is what we have done at CCP.)

2)      Remove _PyObject_DebugMalloc() from the API.  It really should be an implementation detail of the exposed PyObject_Malloc() functions whether they use debug functionality at all.  _PyObject_DebugCheckAddress() and _PyObject_DebugDumpAddress() can be left in place.  But exposing this functionality in macros that external modules compile in is not good at all.


The reason why this is annoying:
Some external software comes with proprietary .pyd bindings.  When developing my own application, with modified preprocessor definitions (e.g. to turn off PYMALLOC_DEBUG) we find that those externally provided libraries don't work.  It takes a fair amount of detective work to find out why exactly linkage fails.  The external API really shouldn't change depending on preprocessor definitions.

Cheers,

K


From martin at v.loewis.de  Tue Jun 15 00:12:30 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 00:12:30 +0200
Subject: [Python-Dev] debug and release python
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
Message-ID: <4C16A94E.9020101@v.loewis.de>

> Some external software comes with proprietary .pyd bindings.

Can you please explain what a "proprietary .pyd binding" is?

Do you mean they come with extension modules? If so, there is no chance
of using them in debug mode, anyway, right? So what specifically is the
problem?

Regards,
Martin

From alexander.belopolsky at gmail.com  Tue Jun 15 00:45:49 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 14 Jun 2010 18:45:49 -0400
Subject: [Python-Dev] Sharing functions between C extension modules in stdlib
Message-ID: <AANLkTin2Z93cgeryRiZg5G-dFKmJ8eBelRKte3kOnY2w@mail.gmail.com>

I learned a long time ago that it is not enough to simply declare
a function in some header file if you want to define it in one module
and use it in another.  You have to use what is now known as PyCapsule -
an array of pointers to C functions wrapped in a Python object.
However, while navigating through the time/datetime maze recently I
have come across timefuncs.h which seems to share
_PyTime_DoubleToTimet between time and datetime modules.
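
As a rough illustration of the capsule mechanism mentioned above, CPython's
own datetime module publishes its C API this way. The sketch below assumes
a CPython interpreter where the C-accelerated datetime module is in use and
exposes a datetime_CAPI attribute; on other implementations the attribute
may be absent, so the helper (a hypothetical name) returns None in that case.

```python
import datetime


def describe_capsule(module, attr_name):
    """Return the type name of module.attr_name, or None if it is missing."""
    obj = getattr(module, attr_name, None)
    if obj is None:
        return None
    return type(obj).__name__


# On CPython this is expected to be a PyCapsule wrapping a struct of C
# function pointers; Python code can hold the object but not look inside it.
capsule_type = describe_capsule(datetime, "datetime_CAPI")
print(capsule_type)
```

Other extension modules obtain the function pointers from such an object at
import time instead of linking against the defining module directly.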

I did not expect this to work, but apparently the build machinery
somehow knows how to place _PyTime_DoubleToTimet code in both time.so
and datetime.so:


$ nm build/lib.macosx-10.4-x86_64-3.2-pydebug/datetime.so | grep
_PyTime_DoubleToTimet
000000000000f4e2 T __PyTime_DoubleToTimet
$ nm build/lib.macosx-10.4-x86_64-3.2-pydebug/time.so | grep
_PyTime_DoubleToTimet
0000000000000996 T __PyTime_DoubleToTimet

I have two questions: 1) how does this happen; and 2) is this intentional?

Thanks.

From alexander.belopolsky at gmail.com  Tue Jun 15 01:00:19 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 14 Jun 2010 19:00:19 -0400
Subject: [Python-Dev] Sharing functions between C extension modules in
	stdlib
In-Reply-To: <AANLkTin2Z93cgeryRiZg5G-dFKmJ8eBelRKte3kOnY2w@mail.gmail.com>
References: <AANLkTin2Z93cgeryRiZg5G-dFKmJ8eBelRKte3kOnY2w@mail.gmail.com>
Message-ID: <AANLkTinGxh_DIYdEkOU6BjQwogLsbJZ-3fa9oVN71VLQ@mail.gmail.com>

On Mon, Jun 14, 2010 at 6:45 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
..
> I did not expect this to work, but apparently the build machinery
> somehow knows how to place _PyTime_DoubleToTimet code in both time.so
> and datetime.so:
..
> I have two questions: 1) how does this happen; and 2) is this intentional?
>

OK, the answer to the first question is simple: in setup.py, we have

        exts.append( Extension('datetime', ['datetimemodule.c', 'timemodule.c'],
                               libraries=math_libs) )

but if timemodule.c is compiled into the datetime module, why does
it also need to be imported to share some other code?

From martin at v.loewis.de  Tue Jun 15 01:09:44 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 01:09:44 +0200
Subject: [Python-Dev] Sharing functions between C extension modules in
 stdlib
In-Reply-To: <AANLkTin2Z93cgeryRiZg5G-dFKmJ8eBelRKte3kOnY2w@mail.gmail.com>
References: <AANLkTin2Z93cgeryRiZg5G-dFKmJ8eBelRKte3kOnY2w@mail.gmail.com>
Message-ID: <4C16B6B8.3030304@v.loewis.de>

> $ nm build/lib.macosx-10.4-x86_64-3.2-pydebug/datetime.so | grep
> _PyTime_DoubleToTimet
> 000000000000f4e2 T __PyTime_DoubleToTimet
> $ nm build/lib.macosx-10.4-x86_64-3.2-pydebug/time.so | grep
> _PyTime_DoubleToTimet
> 0000000000000996 T __PyTime_DoubleToTimet
>
> I have two questions: 1) how does this happen;

'T' means "defined in text segment", so it looks like the code is 
included twice. And indeed, it is:

exts.append( Extension('time', ['timemodule.c'],
                                libraries=math_libs) )
exts.append( Extension('datetime', ['datetimemodule.c', 'timemodule.c'],
                                libraries=math_libs) )

 > and 2) is this intentional?


This was added with

------------------------------------------------------------------------
r36221 | bcannon | 2004-06-24 03:38:47 +0200 (Thu, 24 Jun 2004) | 3 lines

Add compilation of timemodule.c with datetimemodule.c to get
__PyTime_DoubleToTimet().

------------------------------------------------------------------------

So it's clearly intentional. I doubt it's desirable, though. If only
__PyTime_DoubleToTimet needs to be duplicated, I'd rather put that
function into a separate C file that gets included twice, instead of 
including the full timemodule.c into datetimemodule.c.

Regards,
Martin

From alexander.belopolsky at gmail.com  Tue Jun 15 01:17:41 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 14 Jun 2010 19:17:41 -0400
Subject: [Python-Dev] Static linking with libpython.a
In-Reply-To: <F492A530-25D1-4A96-9E36-855362F61C26@code2develop.com>
References: <F492A530-25D1-4A96-9E36-855362F61C26@code2develop.com>
Message-ID: <AANLkTimr9LfpSqAElTqxVlAhLlwOGgwq0zCtv1NUMkrl@mail.gmail.com>

On Mon, Jun 14, 2010 at 6:29 AM, F van der Meeren
<python-dev at code2develop.com> wrote:
..
> I am trying to figure out, what files to copy with my app so I am able to initialize the python runtime.
> Where can I find information about this?

On the comp.lang.python forum.  This list is for developing Python
itself, not applications using Python.

However, in general, you need code in Python, Parser, and Objects
directories.  See LIBRARY_OBJS definition in the Makefile.  These days
you also need some bootstrap code from Lib, AFAIK.
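
As a starting point for locating those pieces on an existing installation,
the sketch below queries the interpreter's own build configuration. It
assumes a Python version that ships the sysconfig module; the LIBRARY and
LIBPL variables are typically set on POSIX builds and may be None elsewhere,
and the function name is made up for illustration.

```python
import sysconfig


def embedding_layout():
    """Collect build-time locations relevant to embedding this interpreter."""
    return {
        # File name of the static library (None if the build does not record it).
        "static_library": sysconfig.get_config_var("LIBRARY"),
        # Directory holding the static library and config files (None if unknown).
        "library_dir": sysconfig.get_config_var("LIBPL"),
        # Directory of the pure-Python standard library needed at start-up.
        "stdlib_dir": sysconfig.get_path("stdlib"),
    }


layout = embedding_layout()
for name, value in layout.items():
    print(f"{name}: {value}")
```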

From kristjan at ccpgames.com  Tue Jun 15 14:48:39 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Tue, 15 Jun 2010 12:48:39 +0000
Subject: [Python-Dev] debug and release python
In-Reply-To: <4C16A94E.9020101@v.loewis.de>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
	<4C16A94E.9020101@v.loewis.de>
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D8FBEE0@exchis.ccp.ad.local>

What I mean is that a third-party software vendor supplies foobarapp.pyd and foobarapp_d.pyd DLLs that link to python2x.dll and python2x_d.dll respectively.  But the latter will have been compiled to match certain settings of the objimpl.h header, which may not match whatever is being used to build the local python2x_d.dll.  And thus, you get strange and hard-to-debug linker errors when trying to load external libraries.

When developing superapp.exe, which uses a custom build of python2x, perhaps even embedded, python2x_d.dll is used extensively both during the development process and the testing process.  This is why foobarapp_d.pyd is necessary and why it is supplied by any sensible vendor providing opaque python extensions.  But the current objimpl.h api makes it a matter of developer choice whether that foobarapp_d.pyd will successfully link with your python2x_d.dll or not.

IMHO, it is not good practice to expose an API that changes depending on preprocessor settings like this.
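
A quick way to see which flavour of interpreter is trying to load an
extension is sketched below. It relies on sys.gettotalrefcount, which is
documented as being available only in debug builds, and on
importlib.machinery.EXTENSION_SUFFIXES from current Python 3; neither
existed in this form in 2.x, and neither reveals the PYMALLOC_DEBUG or
WITH_PYMALLOC settings themselves. The function name is hypothetical.

```python
import importlib.machinery
import sys
import sysconfig


def build_flavour():
    """Summarise whether this interpreter is a debug build and what it loads."""
    return {
        # sys.gettotalrefcount is only present in debug builds of CPython.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # Configure-time flag on POSIX builds; may be None elsewhere.
        "py_debug_flag": sysconfig.get_config_var("Py_DEBUG"),
        # File suffixes this interpreter accepts for extension modules.
        "extension_suffixes": list(importlib.machinery.EXTENSION_SUFFIXES),
    }


info = build_flavour()
print(info)
```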

K

> -----Original Message-----
> From: "Martin v. Löwis" [mailto:martin at v.loewis.de]
> Sent: 14. júní 2010 22:13
> To: Kristján Valur Jónsson
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] debug and release python
> 
> > Some external software comes with proprietary .pyd bindings.
> 
> Can you please explain what a "proprietary .pyd binding" is?
> 
> Do you mean they come with extension modules? If so, there is no chance
> of using them in debug mode, anyway, right? So what specifically is the
> problem?
> 
> Regards,
> Martin


From catherine.devlin at gmail.com  Tue Jun 15 21:51:11 2010
From: catherine.devlin at gmail.com (Catherine Devlin)
Date: Tue, 15 Jun 2010 15:51:11 -0400
Subject: [Python-Dev] Become a Python contributor at PyOhio
Message-ID: <AANLkTila7pPtS-zLq1Z5qd7r2NxtHWogV1auMwWbARtu@mail.gmail.com>

Thanks to David Murray, we're going ahead with plans to make a full-fledged
introduction to core development at PyOhio.  We've just started circulating
this announcement to drum up interest, so if there are people or groups who
you'd like to recruit to the effort, please forward it to them.

By the way, I haven't made a peep on this list yet - or even read it -
because I'm intentionally preserving my ignorance so that I can be the
leader-learner for the Teach Me session.  (It's the first time wilful
ignorance has actually been a virtue).

Anyway, the announcement:

Become a Python contributor at PyOhio
=====================================

Working in Python is awesome. Are you ready to work on Python?

The quality of Python and the Standard Library depends on volunteers who fix
bugs and make improvements to the codebase. If you're interested in joining
these volunteers, good for you! Information on core development is right on
Python's homepage.

However, if you'd like an in-person boost to get you started, come to PyOhio
this July 31 - August 3. One of our many events is "Teach Me Python
Bugfixing", an introduction to working on Python that's guaranteed
newbie-friendly (because a newbie is running it). Next come two evenings and
two full days of Python core sprinting, so you can put your new skills to
use with plenty of helpers around.

It's classroom learning and real-life practice at one free event! See you
there!

Core development:  http://www.python.org/dev/
PyOhio:  http://www.pyohio.org/
Teach Me Python Bugfixing:
http://www.pyohio.org/2010/Talks#A.234_Teach_Me_Python_Bugfixing
PyOhio sprints:  http://www.pyohio.org/Sprints2010

-- 
- Catherine
http://catherinedevlin.blogspot.com/
*** PyOhio 2010 * July 31 - Aug 1 * Columbus, OH * pyohio.org ***

From martin at v.loewis.de  Tue Jun 15 22:19:57 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 15 Jun 2010 22:19:57 +0200
Subject: [Python-Dev] debug and release python
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D8FBEE0@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>	<4C16A94E.9020101@v.loewis.de>
	<930F189C8A437347B80DF2C156F7EC7F0A8D8FBEE0@exchis.ccp.ad.local>
Message-ID: <4C17E06D.6030601@v.loewis.de>

On 15.06.2010 14:48, Kristján Valur Jónsson wrote:
> What I mean is that a third-party software vendor supplies
> foobarapp.pyd and foobarapp_d.pyd DLLs that link to python2x.dll
> and python2x_d.dll respectively.  But the latter will have been
> compiled to match certain settings of the objimpl.h header, which
> may not match whatever is being used to build the local
> python2x_d.dll.  And thus, you get strange and hard-to-debug linker
> errors when trying to load external libraries.

Ok. But your proposed change doesn't fix that, right?

I.e. even with the change, it would *still* depend on objimpl.h (and 
other) settings what ABI this debug DLL exactly has.

So I think this problem can't really be fixed. Instead, you have to 
trust that the vendor did the most sensible thing when building 
foobarapp.pyd, namely activating *just* the debug build.

Then, if you do the same, it will interoperate just fine.

> IMHO, it is not good practice to expose an API that changes depending
> on preprocessor settings like this.

But there are tons of ABI changes that may happen in a debug build.
If you want to cope with all of them, you really need to recompile the 
source of all extensions.

Regards,
Martin

From amauryfa at gmail.com  Tue Jun 15 22:24:05 2010
From: amauryfa at gmail.com (Amaury Forgeot d'Arc)
Date: Tue, 15 Jun 2010 22:24:05 +0200
Subject: [Python-Dev] debug and release python
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
Message-ID: <AANLkTikYo-eDtFGfMP9NUz8oGI67OBidgsN2W_U_EETl@mail.gmail.com>

2010/6/14 Kristján Valur Jónsson <kristjan at ccpgames.com>:
> Hello there.
>
> I'm sure this has come up before, but here it is again:
>
> Python exports a different API in debug mode, depending on whether
> PYMALLOC_DEBUG and WITH_PYMALLOC are defined.  This means that _d.pyd files
> that are used must have been compiled with a version of Python using the
> same settings for these macros.  It is unfortunate that the
> _PyObject_DebugMalloc() API is exposed to external applications using macros
> in objimpl.h
>
> I would suggest two things:
>
> 1)      provide dummy or thunking versions of those in builds that don't
> have PYMALLOC_DEBUG implemented, that thunk to PyObject_Malloc et al. (This
> is what we have done at CCP)
>
> 2)      Remove the _PyObject_DebugMalloc() from the API.  It really should
> be an implementation detail of the exposed PyObject_Malloc() functions whether
> they use debug functionality at all.  The _PyObject_DebugCheckAddress and
> _PyObject_DebugDumpAddress() can be left in place.  But exposing this
> functionality in macros that external modules compile in is not good at
> all.
>
> The reason why this is annoying:
>
> Some external software comes with proprietary .pyd bindings.  When
> developing my own application, with modified preprocessor definitions (e.g.
> to turn off PYMALLOC_DEBUG) we find that those externally provided libraries
> don't work.  It takes a fair amount of detective work to find out why
> exactly linkage fails.  The external API really shouldn't change depending
> on preprocessor definitions.

I remember having the same issue years ago:
http://mail.python.org/pipermail/python-list/2004-July/855844.html

At the time, I solved the issue by compiling extension modules with
pymalloc options turned on
(which is fortunately the default, so this applies to the supplied
proprietary .pyd),
and I added a (plain) definition for functions like _PyObject_DebugMalloc,
even when PYMALLOC_DEBUG is undefined.

Since the python_d.dll is a custom build anyway, adding the code is
not too much pain.

-- 
Amaury Forgeot d'Arc

From catherine.devlin at gmail.com  Tue Jun 15 23:07:22 2010
From: catherine.devlin at gmail.com (Catherine Devlin)
Date: Tue, 15 Jun 2010 17:07:22 -0400
Subject: [Python-Dev] Become a Python contributor at PyOhio
In-Reply-To: <20100615203439.GT8876@ag.com>
References: <AANLkTila7pPtS-zLq1Z5qd7r2NxtHWogV1auMwWbARtu@mail.gmail.com>
	<20100615203439.GT8876@ag.com>
Message-ID: <AANLkTin9NkI3kkiEj2yC9FsnTq86wE-wyDd6OYVXYGm6@mail.gmail.com>

On Tue, Jun 15, 2010 at 4:34 PM, Dan Buch <dbuch at ag.com> wrote:

> Does this mean I should repurpose my talk slot, currently entitled
> "Intro to Core Involvement"?  :)
>

Ach!  I forgot!  Hopefully that's the dumbest mistake I'll make in this
year's PyOhio preparations.  Fortunately the PyCon blog can be edited...
wish emails could be.

No, as I wrote to the talk committee,

"There is some overlap between this talk and Dan Buch's submission, though
his seems to have a heavier focus on doc and triage work.  If they're both
selected, I'll work with Dan to see that the talks dovetail well together.


I would really *love* to see Dan's talk, this talk, and sprints (weekend
sprints AND sprints on the following weekdays) fuse into a great big
contribu-palooza that will put PyOhio on the map!  Well, we're already on
the map."

I actually think it'll be ideal if we can get
- Your talk midday on Saturday, for a classically planned introduction on
multiple aspects of core involvement
- Shortly thereafter, my "teach me" talk, which will be specifically about
bugfixing and will focus on points of newbie confusion by means of my own
all-natural fumbling.  Hopefully some people from your talk's audience will
take their brand-new knowledge to participate in the "teach me" session as
both teachers and learners... nothing solidifies learning like teaching
does.  (I think I need to *not* watch your talk until afterward on video,
incidentally, to keep my ignorance pure.  I might end up as the most
ignorant person in the room, which would be perfect.)
- That evening, the sprinty goodness begins.

-- 
- Catherine
http://catherinedevlin.blogspot.com/
*** PyOhio 2010 * July 31 - Aug 1 * Columbus, OH * pyohio.org ***

From catherine.devlin at gmail.com  Tue Jun 15 23:08:56 2010
From: catherine.devlin at gmail.com (Catherine Devlin)
Date: Tue, 15 Jun 2010 17:08:56 -0400
Subject: [Python-Dev] Become a Python contributor at PyOhio
Message-ID: <AANLkTikOmXXBFPuoup5I9VxVqc7fOjcYm60_v2b67oti@mail.gmail.com>

So let's try this again:

Become a Python contributor at PyOhio
=====================================
Working in Python is awesome. Are you ready to work on Python?

The quality of Python and the Standard Library depends on volunteers who fix
bugs and make improvements to the codebase. If you're interested in joining
these volunteers, good for you! Information on core development is right on
Python's homepage.

However, if you'd like an in-person boost to get you started, come to PyOhio
this July 31 - August 3. Two talks will get you up to speed on Python
contribution: "Intro to Core Involvement" and "Teach Me Python Bugfixing".
Next come two evenings and two full days of Python core sprinting, so you
can put your new skills to use with plenty of helpers around.

It's classroom learning and real-life practice at one free event! See you
there!

Core development:  http://www.python.org/dev/
PyOhio:  http://www.pyohio.org/
Intro to Core Involvement:
http://www.pyohio.org/2010/Talks#A.2320_Intro_to_Core_Involvement
Teach Me Python Bugfixing:
http://www.pyohio.org/2010/Talks#A.234_Teach_Me_Python_Bugfixing
PyOhio sprints:  http://www.pyohio.org/Sprints2010

-- 
- Catherine
http://catherinedevlin.blogspot.com/
*** PyOhio 2010 * July 31 - Aug 1 * Columbus, OH * pyohio.org ***

From stefan_ml at behnel.de  Wed Jun 16 09:47:32 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Wed, 16 Jun 2010 09:47:32 +0200
Subject: [Python-Dev] Are PyCFunctions supposed to invisibly consume
 self when used as a method?
In-Reply-To: <4C1416A3.60004@cheimes.de>
References: <AANLkTil13TKkthZRwE3W7-IlDJvwL2kIBfq-KWlpiJYJ@mail.gmail.com>	<AANLkTimR2hDeOlKTab98bQF_-z19t8gjW9dqWMEXvD1t@mail.gmail.com>	<AANLkTikS0-i0t2LfmxlY2sXH8ONQ7stk-9SX4xu4FHji@mail.gmail.com>	<AANLkTinIfyoyvOJAFyJ-JLFLKvtvG_Aprdvp3dMtRH_-@mail.gmail.com>	<F108A2FD-92DF-4C5C-ACD9-9BE5A31B4256@underboss.org>	<AANLkTimzDpaQukdflUuQtZUjifm3HDfAvsIfH6Rzi33o@mail.gmail.com>	<9DD65C5E-7E01-4A1E-B480-588F58782262@voidspace.org.uk>	<hv13og$u3a$1@dough.gmane.org>	<AANLkTilXN5YkwV2kADP-OmJuIM9LCegHpqt85BSB69sq@mail.gmail.com>
	<4C1416A3.60004@cheimes.de>
Message-ID: <hv9vik$70g$1@dough.gmane.org>

Christian Heimes, 13.06.2010 01:22:
> On 13.06.2010 01:15, Guido van Rossum wrote:
>> Hey! No borrowing the time machine! :-)
>
> Too late

Get the irony?

Stefan


From kristjan at ccpgames.com  Wed Jun 16 10:35:58 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Wed, 16 Jun 2010 08:35:58 +0000
Subject: [Python-Dev] debug and release python
In-Reply-To: <AANLkTikYo-eDtFGfMP9NUz8oGI67OBidgsN2W_U_EETl@mail.gmail.com>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
	<AANLkTikYo-eDtFGfMP9NUz8oGI67OBidgsN2W_U_EETl@mail.gmail.com>
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D8FC053@exchis.ccp.ad.local>


> -----Original Message-----
> From: Amaury Forgeot d'Arc [mailto:amauryfa at gmail.com]
> Sent: 15. júní 2010 21:24
> To: Kristján Valur Jónsson
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] debug and release python
> 
> I remember having the same issue years ago:
> http://mail.python.org/pipermail/python-list/2004-July/855844.html
> 
> At the time, I solved the issue by compiling extension modules with
> pymalloc options turned on
> (which is fortunately the default, so this applies to the supplied
> proprietary .pyd),
> and I added a (plain) definition for functions like
> _PyObject_DebugMalloc,
> even when PYMALLOC_DEBUG is undefined.
> 
> Since the python_d.dll is a custom build anyway, adding the code is
> not too much pain.
> 

It is not too much pain once you realize the problem, no.  But I just got bitten by this and spent the best part of a weekend trying to solve it.  On Windows, you get an import failure on the .pyd file with the message: "Procedure entry point not found". I had come across this previously, some three years ago perhaps, and forgotten all about it, so I was sufficiently annoyed to post to python-dev.
We use python27_d.dll a lot and typically have WITH_PYMALLOC disabled in debug builds so that we can use the debug malloc libraries present on Windows.
I've solved the issue now by making sure that obmalloc.c always exports _PyObject_DebugMalloc(), much as it always exports PyObject_Malloc() whether WITH_PYMALLOC is defined or not.

My suggestion for Python core would be the same: always expose these in existing Python versions, and remove them from the API in new Python versions.

K


From kristjan at ccpgames.com  Wed Jun 16 10:42:07 2010
From: kristjan at ccpgames.com (=?iso-8859-1?Q?Kristj=E1n_Valur_J=F3nsson?=)
Date: Wed, 16 Jun 2010 08:42:07 +0000
Subject: [Python-Dev] debug and release python
In-Reply-To: <4C17E06D.6030601@v.loewis.de>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>
	<4C16A94E.9020101@v.loewis.de>
	<930F189C8A437347B80DF2C156F7EC7F0A8D8FBEE0@exchis.ccp.ad.local>
	<4C17E06D.6030601@v.loewis.de>
Message-ID: <930F189C8A437347B80DF2C156F7EC7F0A8D8FC056@exchis.ccp.ad.local>


> -----Original Message-----
> From: "Martin v. Löwis" [mailto:martin at v.loewis.de]
> Sent: 15. júní 2010 21:20
> To: Kristján Valur Jónsson
> Cc: python-dev at python.org
> Subject: Re: [Python-Dev] debug and release python
> 
> On 15.06.2010 14:48, Kristján Valur Jónsson wrote:
> > What I mean is that a third-party software vendor supplies
> > foobarapp.pyd and foobarapp_d.pyd DLLs that link to python2x.dll
> > and python2x_d.dll respectively.  But the latter will have been
> > compiled to match certain settings of the objimpl.h header, which
> > may not match whatever is being used to build the local
> > python2x_d.dll.  And thus, you get strange and hard-to-debug linker
> > errors when trying to load external libraries.
> 
> Ok. But your proposed change doesn't fix that, right?
> 
> I.e. even with the change, it would *still* depend on objimpl.h (and
> other) settings what ABI this debug DLL exactly has.
> 
I think it does.
My proposal was perhaps not clear:  For existing python versions, always export _PyObject_DebugMalloc et al. irrespective of the WITH_PYMALLOC and PYMALLOC_DEBUG settings.  (PyObject_Malloc() is always exported, even for builds without WITH_PYMALLOC.)
On new python versions, remove the _PyObject_DebugMalloc from the ABI.  Make the switch internal to obmalloc.c, so that you can turn on the debug library by recompiling pythonxx_d.dll only (currently, you have to recompile the .pyd files too!)
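
As a rough illustration of the symptom discussed in this thread (an extension failing to load because the interpreter library lacks an entry point), the sketch below asks the running interpreter whether it exports a given C symbol. It relies on ctypes.pythonapi; the helper name exports_symbol is made up for this example, and whether _PyObject_DebugMalloc is present depends entirely on the Python version and build options in use.

```python
import ctypes


def exports_symbol(name):
    """Return True if the running interpreter exports the C symbol `name`.

    `exports_symbol` is a hypothetical helper name used only for this sketch.
    """
    # ctypes resolves the symbol lazily on attribute access and raises
    # AttributeError when the library has no such entry point.
    return hasattr(ctypes.pythonapi, name)


if __name__ == "__main__":
    for symbol in ("PyObject_Malloc", "_PyObject_DebugMalloc"):
        print(symbol, exports_symbol(symbol))
```

A check like this only reports what the current interpreter exports; it does not say which symbols a given .pyd was compiled to import.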

> But there are tons of ABI changes that may happen in a debug build.
> If you want to cope with all of them, you really need to recompile the
> source of all extensions.
Are there?  Can you give me an example?  I thought we were careful to keep the interface shown to pyd files constant regardless of configuration settings.

K

From msenecal.sc at gmail.com  Wed Jun 16 07:45:54 2010
From: msenecal.sc at gmail.com (Mart)
Date: Wed, 16 Jun 2010 01:45:54 -0400
Subject: [Python-Dev] Release manager/developer (10 years + experience)
	would like to help and volunteer time if needed
Message-ID: <601BF624-5D0B-4E2A-A019-2589491AB6D3@gmail.com>

Hi,

I have worked 10 years at Adobe Systems as a Release Developer for the LiveCycle ES team and have been employed as a Release Manager (for a team of one, me ;) ) at Nuance Communications since last March. I have put lots of effort into keeping Python alive and well at Adobe by providing complete build/release solutions & processes, automation and tooling in my favourite language, Python. I have been promoting, planning and implementing a completely new build/release infrastructure at Nuance, where my expectation is to have a 100% Python shop to manage builds and releases.

I would very much like to offer any help you may require, provided I am a good fit.  I can provide references, resume, etc. if requested.

In hopes of pursuing further discussions, please accept my best regards,

Martin Senecal

Gatineau (Quebec)
Canada

From orsenthil at gmail.com  Wed Jun 16 13:20:51 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Wed, 16 Jun 2010 16:50:51 +0530
Subject: [Python-Dev] Release manager/developer (10 years + experience)
	would like to help and volunteer time if needed
Message-ID: <AANLkTikwgaOKa7XxU1lkIT-jEwQeEsMRAlVHGIcQkzkV@mail.gmail.com>

Welcome! You might just want to follow the process described at
http://www.python.org/dev. That's it.

-- 
Senthil


From ncoghlan at gmail.com  Wed Jun 16 15:19:20 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 16 Jun 2010 23:19:20 +1000
Subject: [Python-Dev] Release manager/developer (10 years + experience)
	would like to help and volunteer time if needed
In-Reply-To: <601BF624-5D0B-4E2A-A019-2589491AB6D3@gmail.com>
References: <601BF624-5D0B-4E2A-A019-2589491AB6D3@gmail.com>
Message-ID: <AANLkTiktSMDEi6NC-I6o6QpzLKZlWfYBeIBIPnzqFxkI@mail.gmail.com>

On Wed, Jun 16, 2010 at 3:45 PM, Mart <msenecal.sc at gmail.com> wrote:
>
> I have put lots of effort into keeping Python alive and well at Adobe by providing complete build/release solutions & processes, automation and tooling in my favourite language, Python.
> I have been promoting, planning and implementing a completely new build/release infrastructure at Nuance, where my expectation is to have a 100% Python shop to manage builds and releases.

Hi Martin,

With that kind of background there are likely a number of ways you
could contribute. From a general Python programming point of view, I'd
start with Brett's intro to CPython development at
http://www.python.org/dev/intro/ and the other links in the dev
section of the web site. There are plenty of bug fixes and feature
requests relating to pure Python components of the standard library
that always need work (even comments just saying "I tried this patch
and it worked for me" can be very helpful).

Specifically in the area of automated build and test management,
Martin von Loewis may have some suggestions for improvements that
could be made to our Buildbot infrastructure that he doesn't have the
time to do himself. It may also be worth checking with Dirkjan Ochtman
to see if there is anything in this space that still needs to be
handled for the transition from svn to hg that will hopefully be
happening later this year. With any luck, those two will actually
chime in here (as they're both python-dev subscribers).

We don't go in for automated binary releases for a variety of reasons
- I definitely advise trawling through the python-dev archives for a
while before getting too enthusiastic on that particular front.

Cheers,
Nick.

--
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From msenecal.sc at gmail.com  Wed Jun 16 16:19:39 2010
From: msenecal.sc at gmail.com (Mart)
Date: Wed, 16 Jun 2010 10:19:39 -0400
Subject: [Python-Dev] Release manager/developer (10 years + experience)
	would like to help and volunteer time if needed
In-Reply-To: <AANLkTiktSMDEi6NC-I6o6QpzLKZlWfYBeIBIPnzqFxkI@mail.gmail.com>
References: <601BF624-5D0B-4E2A-A019-2589491AB6D3@gmail.com>
	<AANLkTiktSMDEi6NC-I6o6QpzLKZlWfYBeIBIPnzqFxkI@mail.gmail.com>
Message-ID: <7A14D4CD-0708-4521-BC1F-785D88BDFAFA@gmail.com>

Hi Nick,

That sounds great! I assume since python-dev has been cc'ed that both Martin von Loewis and Dirkjan Ochtman are listening on this thread. If so, then let me know if there is anything specific that either of you would need a hand with. I would be more than happy to take on some of your "still TODO but no time" items. Meanwhile I will take a closer look at http://www.python.org/dev/intro and see where/if I can roll up my sleeves and lend a hand.

Thanks for the reply & info and I look forward to contributing!

Mart :)




From alexander.belopolsky at gmail.com  Wed Jun 16 17:54:06 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 16 Jun 2010 11:54:06 -0400
Subject: [Python-Dev] Sharing functions between C extension modules in
	stdlib
In-Reply-To: <4C16B6B8.3030304@v.loewis.de>
References: <AANLkTin2Z93cgeryRiZg5G-dFKmJ8eBelRKte3kOnY2w@mail.gmail.com>
	<4C16B6B8.3030304@v.loewis.de>
Message-ID: <AANLkTimbviV-ILNt4SsDhbzQ30JX05jA9j5911LP-g3_@mail.gmail.com>

On Mon, Jun 14, 2010 at 7:09 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
..
> So it's clearly intentional. I doubt it's desirable, though. If only
> __PyTime_DoubleToTimet needs to be duplicated, I'd rather put that
> function into a separate C file that gets included twice, instead of
> including the full timemodule.c into datetimemodule.c.

Thanks for your research, Martin.  I've opened an issue for this at
http://bugs.python.org/issue9012 .

From lutz at rmi.net  Wed Jun 16 22:48:49 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Wed, 16 Jun 2010 20:48:49 -0000
Subject: [Python-Dev] email package status in 3.X
Message-ID: <6wwifklfk7n7tup216062010044853@SMTP>

[copied to pydev from email-sig because of the broader scope]

Well, it looks like I've stumbled onto the "other shoe" on this
issue--that the email package's problems are also apparently 
behind the fact that CGI binary file uploads don't work in 3.1
(http://bugs.python.org/issue4953).  Yikes.

I trust that people realize this is a show-stopper for broader
Python 3.X adoption.  Why 3.0 was rolled out anyhow is beyond 
me; it seems that it would have been better if Python developers
had gotten their own code to work with 3.X, before expecting the 
world at large to do so.

FWIW, after rewriting Programming Python for 3.1, 3.x still feels
a lot like a beta to me, almost 2 years after its release.  How
did this happen?  Maybe nobody is using 3.X enough to care, but 
I have a feeling that issues like this are part of the reason why.

No offense to people who obviously put in an incredible amount of
work on 3.X.  As someone who remembers 0.X, though, it's hard not
to find the current situation a bit disappointing.

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


> -----Original Message-----
> From: lutz at rmi.net
> To: "R. David Murray" <rdmurray at bitdance.com>
> Subject: Re: email package status in 3.X
> Date: Sun, 13 Jun 2010 15:30:06 -0000
> 
> Come to think of it, here was another oddness I just recalled: this 
> may have been reported already, but header decoding returns mixed types
> depending upon the structure of the header.  Converting to a str for 
> display isn't too difficult to handle, but this seems a bit inconsistent
> and contrary to Python's type neutrality:
> 
> >>> from email.header import decode_header
> >>> S1 = 'Man where did you get that assistant?'
> >>> S2 = '=?utf-8?q?Man_where_did_you_get_that_assistant=3F?='
> >>> S3 = 'Man where did you get that =?UTF-8?Q?assistant=3F?='
> 
> # str: don't decode()
> >>> decode_header(S1)
> [('Man where did you get that assistant?', None)]
> 
> # bytes: do decode()
> >>> decode_header(S2)
> [(b'Man where did you get that assistant?', 'utf-8')]
> 
> # bytes: do decode(), using raw-unicode-escape applied in package
> >>> decode_header(S3)
> [(b'Man where did you get that', None), (b'assistant?', 'utf-8')]
> 
> I can work around this with the following code, but it 
> feels a bit too tightly coupled to the package's internal details
> (further evidence that email.* can be made to work as is today, 
> even if it may be seen as less than ideal aesthetically):
> 
> import email.header
> 
> def decode_whole_header(rawheader):                # wrapper name is illustrative
>     parts = email.header.decode_header(rawheader)
>     decoded = []
>     for (part, enc) in parts:                      # for all substrings
>         if enc is None:                            # part unencoded?
>             if not isinstance(part, bytes):        # str: full hdr unencoded
>                 decoded += [part]                  # else do unicode decode
>             else:
>                 decoded += [part.decode('raw-unicode-escape')]
>         else:
>             decoded += [part.decode(enc)]
>     return ' '.join(decoded)
> 
> Thanks,
> --Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)
> 
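
The mixed str/bytes results from decode_header() described in the quoted message above can also be handled without inspecting each part's type. The following is a minimal sketch, assuming a current Python 3 in which email.header.make_header() accepts the pairs returned by decode_header(); it has not been checked against 3.1, and the name header_to_str is invented for the example.

```python
from email.header import decode_header, make_header


def header_to_str(rawheader):
    """Decode an RFC 2047 header value to str without checking part types.

    `header_to_str` is a hypothetical helper name used only for this sketch.
    """
    # make_header() takes the (part, charset) pairs from decode_header(),
    # whether each part is str or bytes; str() then joins them as text.
    return str(make_header(decode_header(rawheader)))


if __name__ == "__main__":
    samples = (
        'Man where did you get that assistant?',
        '=?utf-8?q?Man_where_did_you_get_that_assistant=3F?=',
        'Man where did you get that =?UTF-8?Q?assistant=3F?=',
    )
    for raw in samples:
        print(repr(header_to_str(raw)))
```

This trades the explicit per-part logic for whatever spacing rules the Header class applies when joining parts, which may or may not match what a particular client wants.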
> 
> > -----Original Message-----
> > From: lutz at rmi.net
> > To: "R. David Murray" <rdmurray at bitdance.com>
> > Subject: Re: email package status in 3.X
> > Date: Sat, 12 Jun 2010 16:52:32 -0000
> > 
> > Hi David,
> > 
> > All sounds good, and thanks again for all your work on this.
> > 
> > I appreciate the difficulties of moving this package to 3.X
> > in a backward-compatible way.  My suggestions stem from the fact 
> > that it does work as is today, albeit in a less than ideal way.
> > 
> > That, and I'm seeing that Python 3.X in general is still having
> > a great deal of trouble gaining traction in the "real world" 
> > almost 2 years after its release, and I'd hate to see further 
> > disincentives for people to migrate.  This is a bigger issue
> > than both the email package and this thread, of course.
> > 
> > > > 3) Type-dependent text part encoding
> > > > 
> > > ...
> > > So, in the next releases of Python all MIMEText input should be string,
> > > and it will fail if you pass bytes.  I consider this as email previously
> > > not living up to its published API, but do you think I should hack
> > > in a way for it to accept bytes too, for backward compatibility in the
> > > 3 line?
> > 
> > Decoding can probably be safely delegated to package clients.
> > Typical email clients will probably have str for display of the
> > main text.  They may wish to read attachments in binary mode, but
> > can always read in text mode instead or decode manually, because 
> > they need a known encoding to send the part correctly (my client 
> > has to ask or use configurations in some cases).
> > 
> > B/W compatibility probably isn't a concern; I suspect that my 
> > temporary workaround will still work with your patch anyhow, 
> > and this code didn't work at all for some encodings before.
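
The str-only MIMEText input discussed above, with decoding left to the client, might look roughly like the sketch below. It assumes a current Python 3 email package; build_text_part is a hypothetical helper name, and the choice of utf-8 is only an example.

```python
from email.mime.text import MIMEText


def build_text_part(text, charset="utf-8"):
    """Build a text/plain part from str and return it with its decoded body.

    `build_text_part` is a hypothetical helper name used only for this sketch.
    """
    part = MIMEText(text, "plain", charset)          # input is str, not bytes
    raw = part.get_payload(decode=True)              # bytes, transfer encoding undone
    body = raw.decode(part.get_content_charset())    # client decodes for display
    return part, body


if __name__ == "__main__":
    part, body = build_text_part("caf\u00e9 au lait")
    print(part.get_content_type(), part.get_content_charset())
    print(body)
```

The client still has to know (or look up) the charset to get back to str, which is the point made above about delegating decoding to package clients.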
> > 
> > > > There are some additional cases that now require decoding per mail 
> > > > headers today due to the str/bytes split, but these are just a 
> > > > normal artifact of supporting Unicode character sets in general,
> > > > and seem like issues for package clients to resolve (e.g., the bytes 
> > > > returned for decoded payloads in 3.X didn't play well with existing 
> > > > str-based text processing code written for 2.X).
> > > 
> > > I'm not following you here.  Can you give me some more specific
> > > examples?  Even if these "normal artifacts" must remain with
> > > the current API, I'd like to make things as easy as practical when
> > > using the new API.
> > 
> > This was just a general statement about things in my own code that
> > didn't jibe with the 3.X string model.  For instance, line wrapping 
> > logic assumed str; tkinter text widgets do much better rendering str 
> > than the bytes fetched for decoded payloads; and my Pyedit text editor
> > component had to be overhauled to handle display/edit/save of payloads 
> > of arbitrary encodings.  If I remember any more specific issues with 
> > the email package itself, I'll forward your way.
> > 
> > I'll watch for an opportunity to get the book's new PyMailGUI 
> > client code to you as a candidate test case, but please ping 
> > me about it later if I haven't acted on this.  It works well,
> > but largely because of all the work that went into the email 
> > package underlying it.
> > 
> > Thanks,
> > --Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)
> > 
> > 
> > > -----Original Message-----
> > > From: "R. David Murray" <rdmurray at bitdance.com>
> > > To: lutz at rmi.net
> > > Subject: Re: email package status in 3.X
> > > Date: Thu, 10 Jun 2010 10:18:48 -0400
> > > 
> > > On Thu, 10 Jun 2010 09:21:52 -0400, lutz at rmi.net wrote:
> > > > In other words, some of my concern may have been a bit premature.  
> > > > I hope that in the future we'll either strive for compatibility 
> > > > or keep the current version around; it's a lot of very useful code.
> > > 
> > > The plan is to have a compatibility layer that will accept calls based
> > > on the old API and forward appropriately to the new API.  So far I'm
> > > thinking I can succeed in doing this in a fairly straightforward manner,
> > > but I won't know for sure until I get some more pieces in place.
> > > 
> > > > In fact, I recommend that any new email package be named distinctly, 
> > > 
> > > I'm going to avoid that if I can (though the PyPI package will be
> > > named email6 when we publish it for public testing).  If, however,
> > > it turns out that I can't correctly support both the old and the
> > > new API, then I'll have to do that.
> > > 
> > > > and that the current package be retained for a number of releases to
> > > > come.  After all the breakages that 3.X introduced in general, doing
> > > > the same to any email-based code seems a bit too much, especially 
> > > > given that the current package is largely functional as is.  To me,
> > > > after having just used it extensively, fixing its few issues seems 
> > > > a better approach than starting from scratch.
> > > 
> > > Well, the thing is, as you found, existing 2.x code needs to be fixed to
> > > correctly handle the distinction between strings and bytes no matter what.
> > > The goal is to make it easier to write correct programs, while providing
> > > the compatibility layer to make porting smoother.  But I doubt that any
> > > non-trivial 2.x email program will port without significant changes,
> > > even if the compatibility layer is close to 100% compatible with the
> > > current Python3 email package, simply because the previous conflation
> > > of text and bytes must be untangled in order to work correctly in
> > > Python3, and email involves lots of transitions between text and bytes.
> > > 
> > > As for "starting from scratch", it is true that the current plan involves
> > > considerable changes in the recommended API (in the direction of greater
> > > flexibility and power), but I'm hoping that significant portions of the
> > > code will carry forward with minor changes, and that this will make it
> > > easier to support the old API.
> > > 
> > > > As far as other issues, the things I found are described below my
> > > > signature.  I don't know what the utf-8 issue is that you refer 
> > > > to; I'm able to parse and send with this encoding as is without 
> > > > problems (both payloads and headers), but I'm probably not using the
> > > > interfaces you fixed, and this may be the same as one of the items listed.
> > > 
> > > It is, see below.
> > > 
> > > > Another thought: it might be useful to use the book's email client 
> > > > as a sort of test case for the package; it's much more rigorous in 
> > > > the new edition because it now has to be given 3.X's Unicode model 
> > > > (it's about 4,900 lines of code, though not all is email-related).
> > > > I'd be happy to donate the code as soon as I find out what the 
> > > > copyright will be this time around; it will be at O'Reilly's site
> > > > this Fall in any event.
> > > 
> > > That would be great.  I am planning to write my own sample app to
> > > demonstrate the new API, but if I can use yours to test the compatibility
> > > layer that will help a lot, since I otherwise have no Python3 email
> > > application to test against unless I port something from Python2.
> > > 
> > > > Major issues I found...
> > > > ------------------------------------------------------------------
> > > > 1) Str required for parsing, but bytes returned from poplib
> > > > 
> > > > The initial decode from bytes to str of full mail text; in 
> > > > retrospect, probably not a major issue, since original email 
> > > > standards called for ASCII.  An 8-bit encoding like Latin-1 is
> > > > probably sufficient for most conforming mails.  For the book,
> > > > I try a set of different encodings, beginning with an optional
> > > > configuration module setting, then ascii, latin-1, and utf-8;
> > > > this is probably overkill, but a GUI has to be defensive.
> > > 
> > > This works (mostly) for conforming email, but some important Python email
> > > applications need to deal with non-conforming email.  That's where the
> > > inability to parse bytes directly really causes problems.
> > > 
> > > > 2) Binary attachments encoding
> > > > 
> > > > The binary attachments byte-to-str issue that you've just
> > > > fixed.  As I mentioned, I worked around this by passing in a 
> > > > custom encoder that calls the original and runs an extra decode
> > > > step.  Here's what my fix looked like in the book; your patch 
> > > > may do better, and I will minimally add a note about the 3.1.3
> > > > and 3.2 fix for this:
> > > 
> > > Yeah, our patch was a lot simpler since we could fix the encoding inside
> > > the loop producing the encoded lines :)
> > > 
> > > > 3) Type-dependent text part encoding
> > > > 
> > > > There's a str/bytes confusion issue related to Unicode encodings
> > > > in text payload generation: some encodings require the payload to
> > > > be str, but others expect bytes.  Unfortunately, this means that 
> > > > clients need to know how the package will react to the encoding 
> > > > that is used, and special-case based upon that.  
> > > 
> > > This was the UTF-8 bug I fixed.  I shouldn't have called it "the UTF-8
> > > bug", because it applies equally to the other charsets that use base64,
> > > as you note.  I called it that because UTF-8 was where the problem was
> > > noticed and is mentioned in the title of the bug report.
> > > 
> > > I had a suspicion that the quoted-printable encoding wasn't being done
> > > correctly either, so to hear that it is working for you is good news.
> > > There may still be bugs to find there, though.
> > > 
> > > So, in the next releases of Python all MIMEText input should be string,
> > > and it will fail if you pass bytes.  I consider this as email previously
> > > not living up to its published API, but do you think I should hack
> > > in a way for it to accept bytes too, for backward compatibility in the
> > > 3 line?
> > > 
> > > > There are some additional cases that now require decoding per mail 
> > > > headers today due to the str/bytes split, but these are just a 
> > > > normal artifact of supporting Unicode character sets in general,
> > > > and seem like issues for package clients to resolve (e.g., the bytes 
> > > > returned for decoded payloads in 3.X didn't play well with existing 
> > > > str-based text processing code written for 2.X).
> > > 
> > > I'm not following you here.  Can you give me some more specific
> > > examples?  Even if these "normal artifacts" must remain with
> > > the current API, I'd like to make things as easy as practical when
> > > using the new API.
> > > 
> > > Thanks for all your feedback!
> > > 
> > > --David
> > > 
> > 
> > 
> > 
> > 
> 
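
Item 1 in the quoted list above describes decoding the bytes fetched from poplib by trying an optional configured encoding first, then ascii, latin-1, and utf-8. A minimal sketch of that fallback order follows; the function name decode_mail_bytes and its preferred parameter are assumptions made for illustration, not code from the book.

```python
def decode_mail_bytes(data, preferred=None):
    """Decode raw mail bytes to str, trying several encodings in order.

    `decode_mail_bytes` and `preferred` are hypothetical names for this sketch.
    Returns the decoded text together with the encoding that worked.
    """
    candidates = ([preferred] if preferred else []) + ["ascii", "latin-1", "utf-8"]
    for encoding in candidates:
        try:
            return data.decode(encoding), encoding
        except (UnicodeDecodeError, LookupError):
            continue                      # bad bytes or unknown codec: try next
    # latin-1 maps every byte value, so with this order the loop normally
    # returns before reaching this point.
    raise ValueError("no candidate encoding could decode the message")


if __name__ == "__main__":
    print(decode_mail_bytes(b"plain text"))
    print(decode_mail_bytes(b"caf\xe9"))
    print(decode_mail_bytes("caf\u00e9".encode("utf-8"), preferred="utf-8"))
```

Because latin-1 never fails, a utf-8 message only decodes as utf-8 here when that encoding is supplied as the preferred setting.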


From ncoghlan at gmail.com  Wed Jun 16 23:47:27 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 17 Jun 2010 07:47:27 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP>
References: <6wwifklfk7n7tup216062010044853@SMTP>
Message-ID: <AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>

On Thu, Jun 17, 2010 at 6:48 AM,  <lutz at rmi.net> wrote:
> I trust that people realize this is a show-stopper for broader
> Python 3.X adoption.  Why 3.0 was rolled out anyhow is beyond
> me; it seems that it would have been better if Python developers
> had gotten their own code to work with 3.X, before expecting the
> world at large to do so.
>
> FWIW, after rewriting Programming Python for 3.1, 3.x still feels
> a lot like a beta to me, almost 2 years after its release.  How
> did this happen?  Maybe nobody is using 3.X enough to care, but
> I have a feeling that issues like this are part of the reason why.
>
> No offense to people who obviously put in an incredible amount of
> work on 3.X.  As someone who remembers 0.X, though, it's hard not
> to find the current situation a bit disappointing.

Agreed, but the binary/text distinction in 2.x (or rather, the lack
thereof) makes the unicode handling situation so hopelessly confused
that there is a lot of 2.x code (including in the standard library)
that silently mixes the two, often without really testing the
consequences (as clearly happened here).

3.x was rolled out anyway because the vast majority of it works.
Obviously people affected by the problems specific to the email
package and any other binary vs text parsing problems that are still
lingering are out of luck at the moment, but leaving 3.x sitting on a
shelf indefinitely would hardly have inspired anyone to clean it up.
My personal perspective is that a lot of that code was likely already
broken in hard to detect ways when dealing with mixed encodings -
releasing 3.x just made the associated errors significantly easier to
detect.

If we end up being able to add your email client code to the standard
library's unit test suite, that should help the situation immensely.

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From martin at v.loewis.de  Thu Jun 17 09:29:01 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Thu, 17 Jun 2010 09:29:01 +0200
Subject: [Python-Dev] debug and release python
In-Reply-To: <930F189C8A437347B80DF2C156F7EC7F0A8D8FC056@exchis.ccp.ad.local>
References: <930F189C8A437347B80DF2C156F7EC7F0A8D73C303@exchis.ccp.ad.local>	<4C16A94E.9020101@v.loewis.de>	<930F189C8A437347B80DF2C156F7EC7F0A8D8FBEE0@exchis.ccp.ad.local>	<4C17E06D.6030601@v.loewis.de>
	<930F189C8A437347B80DF2C156F7EC7F0A8D8FC056@exchis.ccp.ad.local>
Message-ID: <4C19CEBD.9080304@v.loewis.de>


>> I.e. even with the change, it would *still* depend on objimpl.h
>> (and other) settings what ABI this debug DLL exactly has.
>>
> I think it does. My proposal was perhaps not clear:  For existing
> python versions, always export _PyObject_DebugMalloc et al.

Hmm. That's still not clear. What are "existing Python versions"?
You can't change them anymore; any change can only affect future,
as-of-yet-non-existing Python versions.

Also, what do you mean by "always"? Even in release builds?
Would this really help? You shouldn't be mixing _PyObject_DebugMalloc
and PyObject_Malloc in a single process.

> On new python versions, remove the
> _PyObject_DebugMalloc from the ABI.  Make the switch internal to
> obmalloc.c, so that you can turn on the debug library by recompiling
> pythonxx_d.dll only (currently, you have to recompile the .pyd files
> too!)

That sounds fine.

>> But there are tons of ABI changes that may happen in a debug
>> build. If you want to cope with all of them, you really need to
>> recompile the source of all extensions.
> Are there?  Can you give me an example?

If you define Py_TRACE_REFS, every object has two additional pointers,
which aren't there if you don't. So extensions compiled with it are 
incompatible with extensions compiled without it.

If you define COUNT_ALLOCS, every type object will have additional 
slots; again, you can't mix extensions that have a different setting 
here than the interpreter.
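
To see from Python which of these build options the running interpreter appears to have, something like the sketch below can help. It is only an approximation under stated assumptions: sys.getobjects is taken as a sign of Py_TRACE_REFS and sys.gettotalrefcount as a sign of a debug build, the dictionary keys and the name describe_build are invented for the example, and newer CPython versions may lay out objects differently than described in this thread.

```python
import struct
import sys
import sysconfig


def describe_build():
    """Collect hints about ABI-relevant build options of this interpreter.

    `describe_build` and the dictionary keys are hypothetical names.
    """
    pointer_size = struct.calcsize("P")
    return {
        # Config variable recorded at build time; may be missing on some platforms.
        "Py_DEBUG": bool(sysconfig.get_config_var("Py_DEBUG")),
        # Present only in debug builds.
        "has_gettotalrefcount": hasattr(sys, "gettotalrefcount"),
        # Present only in builds compiled with Py_TRACE_REFS.
        "has_getobjects": hasattr(sys, "getobjects"),
        # Size of a bare object header as this interpreter sees it.
        "object_header_bytes": object.__basicsize__,
        # Size of a header holding just a refcount and a type pointer.
        "two_word_header_bytes": 2 * pointer_size,
    }


if __name__ == "__main__":
    for key, value in describe_build().items():
        print(key, value)
```

If object_header_bytes is larger than two_word_header_bytes, extensions built against the smaller layout would read object fields at the wrong offsets, which is the kind of incompatibility described above.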

Regards,
Martin


From barry at python.org  Thu Jun 17 17:43:29 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 17 Jun 2010 11:43:29 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP>
References: <6wwifklfk7n7tup216062010044853@SMTP>
Message-ID: <20100617114329.254db9ac@heresy>

On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote:

>Well, it looks like I've stumbled onto the "other shoe" on this
>issue--that the email package's problems are also apparently 
>behind the fact that CGI binary file uploads don't work in 3.1
>(http://bugs.python.org/issue4953).  Yikes.
>
>I trust that people realize this is a show-stopper for broader
>Python 3.X adoption.

We know it, we have extensively discussed how to fix it, we have IMO a good
design, and we even have someone willing and able to tackle the problem.  We
need to find a sufficient source of funding to enable him to do the work it
will take, and so far that's been the biggest stumbling block.  It will take a
focused and determined effort to see this through, and it's obvious that
volunteers cannot make it happen.  I include myself in the latter category, as
I've tried and failed at least twice to do it in my spare time.

-Barry

From janssen at parc.com  Thu Jun 17 20:11:22 2010
From: janssen at parc.com (Bill Janssen)
Date: Thu, 17 Jun 2010 11:11:22 PDT
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>
Message-ID: <58318.1276798282@parc.com>

Nick Coghlan <ncoghlan at gmail.com> wrote:

> My personal perspective is that a lot of that code was likely already
> broken in hard to detect ways when dealing with mixed encodings -
> releasing 3.x just made the associated errors significantly easier to
> detect.

I have to agree with this, and not just about encodings.  I think much
of the stdlib code dealing with all aspects of HTTP (urllib and the http
package which now includes cgi) is kind of shaky.  And it affects
(infects) other parts of the stdlib, too; sockets are hacked to support
the read-after-close paradigm that httplib uses, for instance.  Which
means that SSL and other socket-using code also has to support it, etc.
Some of this was cleaned up in the move to 3.x, but more work needs to
be done.  Cudos to the folks working on httplib2
(http://code.google.com/p/httplib2/) and WSGI.
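
For readers unfamiliar with the read-after-close paradigm mentioned above, here is a small sketch of the behaviour as it can be observed in current Python 3: a file object obtained from socket.makefile() stays readable after the socket object itself has been closed. This only illustrates the pattern, it is not httplib's actual code, and the function name read_after_close is made up.

```python
import socket


def read_after_close():
    """Read from a makefile() object after its socket object was closed.

    `read_after_close` is a hypothetical name used only for this sketch.
    """
    left, right = socket.socketpair()
    try:
        reader = right.makefile("rb")    # file object holds a reference to the socket
        right.close()                    # socket object is closed by the caller...
        left.sendall(b"hello\n")
        line = reader.readline()         # ...yet the file object can still read
        reader.close()                   # the connection is released only now
        return line
    finally:
        left.close()


if __name__ == "__main__":
    print(read_after_close())
```

Any layer that wraps sockets has to preserve this deferred-close behaviour, which is the knock-on effect on SSL and other socket-using code described above.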

There's a related meta-issue having to do with antique protocols.  FTP,
for instance, was designed when the Internet had only 19 nodes connected
together with custom-built refrigerator-sized routers.  A very early
experiment in application protocols.  It does a few odd things that
we've since learned to be inefficient/unwise/unnecessary.  Does it make
sense that Python support every part of it?  On the other hand, it was
fairly static when the Python support was added (unlike HTTP, which was
under very active development!) so that module is pretty robust.

Bill

From brett at python.org  Thu Jun 17 21:24:54 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 17 Jun 2010 12:24:54 -0700
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100617114329.254db9ac@heresy>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy>
Message-ID: <AANLkTimiDfufv_7SaBFrsJb5HnFUur4wzLiqlA0Phk-X@mail.gmail.com>

On Thu, Jun 17, 2010 at 08:43, Barry Warsaw <barry at python.org> wrote:
> On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote:
>
>>Well, it looks like I've stumbled onto the "other shoe" on this
>>issue--that the email package's problems are also apparently
>>behind the fact that CGI binary file uploads don't work in 3.1
>>(http://bugs.python.org/issue4953).  Yikes.
>>
>>I trust that people realize this is a show-stopper for broader
>>Python 3.X adoption.
>
> We know it, we have extensively discussed how to fix it, we have IMO a good
> design, and we even have someone willing and able to tackle the problem.  We
> need to find a sufficient source of funding to enable him to do the work it
> will take, and so far that's been the biggest stumbling block.  It will take a
> focused and determined effort to see this through, and it's obvious that
> volunteers cannot make it happen.  I include myself in the latter category, as
> I've tried and failed at least twice to do it in my spare time.

And in general I think this is the reason some modules have not
transitioned as well as others: there are only so many of us. The
stdlib passes its test suite, but obviously some unit tests do not
cover enough of the code in the ways people need it covered.

As for using Python 3 for my code, I do and have since Python 3 became
more-or-less usable. I just happen to not work with internet-related
stuff in my day-to-day work.

Plus we have needed to maintain FOUR branches for a while. That is a
nasty time sink when you are having to port bug fixes and such. It
also means that python-dev has been focused on making sure Python 2.7
is a solid release instead of getting to focus on the stdlib in Python
3. This is a nasty chicken-and-egg issue; we could ignore Python 2 and
focus on Python 3, but then the community would complain about us not
supporting the transition from 2 to 3 better, but obviously focusing
on 2 has led to 3 not getting enough TLC.

Once Python 2.7 is done and out the door the entire situation for
Python 3 should start to improve as python-dev as a whole will have a
chance to begin to focus solely on Python 3.

From g.rodola at gmail.com  Fri Jun 18 00:40:16 2010
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Fri, 18 Jun 2010 00:40:16 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <58318.1276798282@parc.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>
	<58318.1276798282@parc.com>
Message-ID: <AANLkTilyEP0DXlbEutiov6AXVcVhZP1Wk6fgwrYn3Mwq@mail.gmail.com>

2010/6/17 Bill Janssen <janssen at parc.com>:

> There's a related meta-issue having to do with antique protocols.

Can I ask which meta-issue you are talking about, exactly?

> FTP, for instance, was designed when the Internet had only 19 nodes connected
> together with custom-built refrigerator-sized routers.  A very early
> experiment in application protocols.  It does a few odd things that
> we've since learned to be inefficient/unwise/unnecessary.  Does it make
> sense that Python support every part of it?

Since FTP is still quite widespread, I'd say it makes a lot of sense.
That aside, what parts of urllib/http* are penalized because of FTP support?


--- Giampaolo
http://code.google.com/p/pyftpdlib
http://code.google.com/p/psutil

From steve at holdenweb.com  Fri Jun 18 04:32:51 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 18 Jun 2010 11:32:51 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100617114329.254db9ac@heresy>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy>
Message-ID: <4C1ADAD3.9070808@holdenweb.com>

Barry Warsaw wrote:
> On Jun 16, 2010, at 08:48 PM, lutz at rmi.net wrote:
> 
>> Well, it looks like I've stumbled onto the "other shoe" on this
>> issue--that the email package's problems are also apparently 
>> behind the fact that CGI binary file uploads don't work in 3.1
>> (http://bugs.python.org/issue4953).  Yikes.
>>
>> I trust that people realize this is a show-stopper for broader
>> Python 3.X adoption.
> 
> We know it, we have extensively discussed how to fix it, we have IMO a good
> design, and we even have someone willing and able to tackle the problem.  We
> need to find a sufficient source of funding to enable him to do the work it
> will take, and so far that's been the biggest stumbling block.  It will take a
> focused and determined effort to see this through, and it's obvious that
> volunteers cannot make it happen.  I include myself in the latter category, as
> I've tried and failed at least twice to do it in my spare time.
> 
> -Barry
> 
Lest the readership think that the PSF is unaware of this issue, allow
me to point out that we have already partially funded this effort, and
are still offering R. David Murray some further matching funds if he can
raise sponsorship to complete the effort (on which he has made a very
promising start).

We are also attempting to enable tax-deductible fund raising to increase
the likelihood of David's finding support. Perhaps we need to think
about a broader campaign to increase the quality of the python 3
libraries. I find it very annoying that the #python IRC group still has
"Don't use Python 3" in its topic.  They adamantly refuse to remove it
until there is better library support, and they are the guys who see the
issues day in day out so it is hard to argue with them (and I don't
think an autocratic decision-making process would be appropriate).

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From arcriley at gmail.com  Fri Jun 18 05:16:47 2010
From: arcriley at gmail.com (Arc Riley)
Date: Thu, 17 Jun 2010 23:16:47 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C1ADAD3.9070808@holdenweb.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy> <4C1ADAD3.9070808@holdenweb.com>
Message-ID: <AANLkTilSNPgK2GnutYbxKdIPWZY82v6LFus3fzat7Fzk@mail.gmail.com>

David and his Google Summer of Code student, Shashwat Anand.

You can read Shashwat's weekly progress updates at http://l0nwlf.in/ or
subscribe to http://twitter.com/l0nwlf for more micro updates.

We have more than 30 paid students working on Python 3 tasks this year, most
of them participating under the PSF umbrella but also a few with 3rd party
projects such as Mercurial porting those various packages to Py3.

Given all this "on the horizon" work, I think the Py3 package situation will
look a lot brighter by Python 3.2's release.


On Thu, Jun 17, 2010 at 10:32 PM, Steve Holden <steve at holdenweb.com> wrote:

>
> Lest the readership think that the PSF is unaware of this issue, allow
> me to point out that we have already partially funded this effort, and
> are still offering R. David Murray some further matching funds if he can
> raise sponsorship to complete the effort (on which he has made a very
> promising start).
>
> We are also attempting to enable tax-deductible fund raising to increase
> the likelihood of David's finding support.
>

From stephen at xemacs.org  Fri Jun 18 07:52:17 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 18 Jun 2010 14:52:17 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <6wwifklfk7n7tup216062010044853@SMTP>
References: <6wwifklfk7n7tup216062010044853@SMTP>
Message-ID: <87d3volwfi.fsf@uwakimon.sk.tsukuba.ac.jp>

lutz at rmi.net writes:

 > FWIW, after rewriting Programming Python for 3.1, 3.x still feels
 > a lot like a beta to me, almost 2 years after its release.

Email, of course, is a big wart.  But guess what?  Python 2's email
module doesn't actually work!  Sure, the program runs most of the
time, but every program that depends on email must acquire inches of
armorplate against all the things that can go wrong.  You simply can't
rely on it to DTRT except in a pre-MIME, pre-HTML, ASCII-only world.
Although they're often addressing general problems, these hacks are
*not* integrated back into the email module in most cases, but remain
app-specific voodoo.

If you live in Kansas, sure, you can concentrate on dodging tornados
and completely forget about Unicode and MIME and text/bogus content.
For the rest of the world, though, the problem is not Python 3.  It's
STD 11 (which still points at RFC 822, dated 1982!)  It's really
inappropriate to point at the email module, whose developers are
trying *not* to punt on conformance and robustness, when even the IETF
can only "run in circles, scream and shout"!

Maybe there are other problems with Python 3 that deserve to be
pointed at, but given the general scarcity of resources I think the
email module developers are working on the right things.  Unlike many
other modules, email really needs to be rewritten from the ground
(Python 3) up, because of the centrality of bytes/unicode confusion to
all email problems.  Python 3 completely changes the assumptions
there; a Python 2-style email module really can't work properly.
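
As a minimal sketch of the bytes/unicode point: a header such as the
"From:" line of an earlier message in this thread travels on the wire
as ASCII-encoded bytes plus a declared charset, and only becomes text
once it is decoded with that charset.  The sketch assumes a current
Python 3 standard library (not the 3.1 package under discussion), and
the helper name decode_mime_header is hypothetical.

```python
from email.header import decode_header


def decode_mime_header(value: str) -> str:
    """Turn an RFC 2047 encoded header value into a plain str.

    Hypothetical helper for illustration only.
    """
    pieces = []
    for chunk, charset in decode_header(value):
        if isinstance(chunk, bytes):
            # Encoded words come back as bytes plus the declared charset;
            # the charset is None for undeclared (plain ASCII) parts.
            pieces.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # A value with no encoded words is returned as str unchanged.
            pieces.append(chunk)
    return "".join(pieces)


if __name__ == "__main__":
    print(decode_mime_header("=?ISO-8859-1?Q?Giampaolo_Rodol=E0?="))
```

The explicit bytes/str branch is the point: in Python 3 the two types
cannot be mixed silently, so the charset has to be applied somewhere.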

Then on top of that, today we know a lot more about handling issues
like text/html content and MIME in general than when the Python 2
email module was designed.  New problems have arisen over the period
of Python 3 development, like "domain keys", which email doesn't
handle out of the box AFAIK, but email for Python 3 should IMHO.

Should Python 3 have been held back until email was fixed?  Dunno, but
I personally am very glad it was not; where I have a choice, I always
use Python 3 now, and have yet to run into a problem.  I expect that
to change if I can find the time to get involved in email and Mailman
3 development, of course.<wink>


From stephen at thorne.id.au  Fri Jun 18 07:07:12 2010
From: stephen at thorne.id.au (Stephen Thorne)
Date: Fri, 18 Jun 2010 15:07:12 +1000
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
Message-ID: <20100618050712.GC20639@thorne.id.au>

Steve Holden Wrote:
> We are also attempting to enable tax-deductible fund raising to increase
> the likelihood of David's finding support. Perhaps we need to think
> about a broader campaign to increase the quality of the python 3
> libraries. I find it very annoying that the #python IRC group still has
> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
> until there is better library support, and they are the guys who see the
> issues day in day out so it is hard to argue with them (and I don't
> think an autocratic decision-making process would be appropriate).

Yes, #python keeps the text "It's too early to use Python 3.x" in its topic.
Library support is the only reason.

-- 
Regards,
Stephen Thorne
Development Engineer

From techtonik at gmail.com  Fri Jun 18 14:44:15 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 18 Jun 2010 15:44:15 +0300
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <20100618050712.GC20639@thorne.id.au>
References: <20100618050712.GC20639@thorne.id.au>
Message-ID: <AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>

On Fri, Jun 18, 2010 at 8:07 AM, Stephen Thorne <stephen at thorne.id.au> wrote:
>> We are also attempting to enable tax-deductible fund raising to increase
>> the likelihood of David's finding support. Perhaps we need to think
>> about a broader campaign to increase the quality of the python 3
>> libraries. I find it very annoying that the #python IRC group still has
>> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
>> until there is better library support, and they are the guys who see the
>> issues day in day out so it is hard to argue with them (and I don't
>> think an autocratic decision-making process would be appropriate).
>
> Yes, #python keeps the text "It's too early to use Python 3.x" in its topic.
> Library support is the only reason.

I do not know what you are intending to do, but my opinion is that fund
raising for patching the library is a waste of money. The PSF should
concentrate on enhancing tools to make the lives of library supporters
easier. I do not want to become a maintainer, and I believe there has
been a lot of spam about this topic from me. The latest thread was in
http://bugs.python.org/issue9008; in short:

`pydotorg` tools - there is no:
1. separate commit notifications for the module with ability to reply
to dedicated group for review
2. separate bug tracker category for my module with giving users
ability to change every property of it
3. bug tracker timeline for the module that includes ticket changes,
wiki edits, commits and everything else. Filtered.
4. roadmap page with actual status, plans and coverage
5. dashboard page with links to all the above

`python development tools`:
1. no way to get all related code for the module
  1.1. source code location (repository, branches)
  1.2. source code components (source file, tests, documentation)
2. no code coverage (test/user story/rfc/pep)
3. no convenient way to run module-related tests
http://bugs.python.org/issue9027
4. no code review management process
5. no way to notify interested parties

-- 
anatoly t.

From techtonik at gmail.com  Fri Jun 18 15:08:49 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Fri, 18 Jun 2010 16:08:49 +0300
Subject: [Python-Dev] cmdline arguments in test_support.run_unittest
Message-ID: <AANLkTilB9VlT060bntGcWoQC1K30_HP5JMf6FEo29-zO@mail.gmail.com>

I thought that some arguments to test_support.run_unittest would be useful.
Would like to hear your feedback before making anything.

http://bugs.python.org/issue9028

-- 
anatoly t.

From jnoller at gmail.com  Fri Jun 18 15:19:37 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 18 Jun 2010 09:19:37 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
Message-ID: <AANLkTikD3zOmYZ1zHDJo6m7Ifu2txeTDA_FfOX1AkgGH@mail.gmail.com>

On Fri, Jun 18, 2010 at 8:44 AM, anatoly techtonik <techtonik at gmail.com> wrote:
> On Fri, Jun 18, 2010 at 8:07 AM, Stephen Thorne <stephen at thorne.id.au> wrote:
>>> We are also attempting to enable tax-deductible fund raising to increase
>>> the likelihood of David's finding support. Perhaps we need to think
>>> about a broader campaign to increase the quality of the python 3
>>> libraries. I find it very annoying that the #python IRC group still has
>>> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
>>> until there is better library support, and they are the guys who see the
>>> issues day in day out so it is hard to argue with them (and I don't
>>> think an autocratic decision-making process would be appropriate).
>>
>> Yes, #python keeps the text "It's too early to use Python 3.x" in its topic.
>> Library support is the only reason.
>
> I do not know what you are intending to do, but my opinion is that fund
> raising for patching the library is a waste of money. The PSF should
> concentrate on enhancing tools to make the lives of library supporters
> easier. I do not want to become a maintainer, and I believe there has
> been a lot of spam about this topic from me. The latest thread was in
> http://bugs.python.org/issue9008; in short:

Awesome. I plan on wasting as much money on the useless effort of
moving python 3 forward as humanly possible.

From barry at python.org  Fri Jun 18 15:45:57 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 18 Jun 2010 09:45:57 -0400
Subject: [Python-Dev] [Email-SIG]  email package status in 3.X
In-Reply-To: <4C1ADAD3.9070808@holdenweb.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<20100617114329.254db9ac@heresy> <4C1ADAD3.9070808@holdenweb.com>
Message-ID: <20100618094557.77a07994@heresy>

On Jun 18, 2010, at 11:32 AM, Steve Holden wrote:

>Lest the readership think that the PSF is unaware of this issue, allow
>me to point out that we have already partially funded this effort, and
>are still offering R. David Murray some further matching funds if he can
>raise sponsorship to complete the effort (on which he has made a very
>promising start).

Right, sorry, I didn't mean to imply the PSF isn't doing anything.  More that
we need a coordinated effort among all the companies and organizations that
use Python to help fund Python 3 library development (and not just in the
stdlib).  I think the PSF is best suited to coordinating and managing those
efforts, and through its tax-exempt status, collecting and distributing
donations specifically targeted to Python 3 work.

-Barry

From steve at pearwood.info  Fri Jun 18 16:09:45 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Sat, 19 Jun 2010 00:09:45 +1000
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTikD3zOmYZ1zHDJo6m7Ifu2txeTDA_FfOX1AkgGH@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<AANLkTikD3zOmYZ1zHDJo6m7Ifu2txeTDA_FfOX1AkgGH@mail.gmail.com>
Message-ID: <201006190009.46122.steve@pearwood.info>

On Fri, 18 Jun 2010 11:19:37 pm Jesse Noller wrote:

> Awesome. I plan on wasting as much money on the useless effort of
> moving python 3 forward as humanly possible.

I'm sorry, but if that's sarcasm, it's far too subtle for me :(


-- 
Steven D'Aprano

From jnoller at gmail.com  Fri Jun 18 16:24:29 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 18 Jun 2010 10:24:29 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <201006190009.46122.steve@pearwood.info>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<AANLkTikD3zOmYZ1zHDJo6m7Ifu2txeTDA_FfOX1AkgGH@mail.gmail.com>
	<201006190009.46122.steve@pearwood.info>
Message-ID: <AANLkTik2tQbDxXFWsKGLbWdTY8Y8qHr5ofjR0Z5crYA4@mail.gmail.com>

On Fri, Jun 18, 2010 at 10:09 AM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Fri, 18 Jun 2010 11:19:37 pm Jesse Noller wrote:
>
>> Awesome. I plan on wasting as much money on the useless effort of
>> moving python 3 forward as humanly possible.
>
> I'm sorry, but if that's sarcasm, it's far too subtle for me :(
>

Yes, it is. See:

http://jessenoller.com/2010/05/20/announcing-python-sprint-sponsorship/

This, in my mind is but a start. Along with RDM's sponsorship for the
email module, the PSF and the community as a whole should be spending
time and money (if they can) to port and help push Python 3 along.
Therefore, I was responding directly to Anatoly's:

"I do not know what you are intending to do, but my opinion is that fund
raising for patching the library is a waste of money"

To which my response stands: based on his opinion, I intend on
wasting as much money as I can.

jesse

From brian.curtin at gmail.com  Fri Jun 18 17:04:31 2010
From: brian.curtin at gmail.com (Brian Curtin)
Date: Fri, 18 Jun 2010 10:04:31 -0500
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
Message-ID: <AANLkTimnL0GfmWULGWfdvxVONdn1MgBh8etl0pChO517@mail.gmail.com>

On Fri, Jun 18, 2010 at 07:44, anatoly techtonik <techtonik at gmail.com> wrote:

> On Fri, Jun 18, 2010 at 8:07 AM, Stephen Thorne <stephen at thorne.id.au>
> wrote:
> >> We are also attempting to enable tax-deductible fund raising to increase
> >> the likelihood of David's finding support. Perhaps we need to think
> >> about a broader campaign to increase the quality of the python 3
> >> libraries. I find it very annoying that the #python IRC group still has
> >> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
> >> until there is better library support, and they are the guys who see the
> >> issues day in day out so it is hard to argue with them (and I don't
> >> think an autocratic decision-making process would be appropriate).
> >
> > Yes, #python keeps the text "It's too early to use Python 3.x" in its
> topic.
> > Library support is the only reason.
>
> I do not know what you are intending to do, but my opinion is that fund
> raising for patching the library is a waste of money. The PSF should
> concentrate on enhancing tools to make the lives of library supporters
> easier. I do not want to become a maintainer, and I believe there has
> been a lot of spam about this topic from me. The latest thread was in
> http://bugs.python.org/issue9008; in short:
>
> `pydotorg` tools - there is no:
> 1. separate commit notifications for the module with ability to reply
> to dedicated group for review


If you know how to set this up, feel free to implement it.


> 2. separate bug tracker category for my module with giving users
> ability to change every property of it
>

The Python bug tracker isn't the place for "my module".
The second part of this sentence has been brought up and I think it's a good
point. For example, those who lack developer privileges can't assign issues
to themselves. I think Twisted's tracker does well in this area, as the
fields are inclusive rather than exclusive. Assignment is open to anyone
willing to work on it, and the field is used to prod the next responsible
person into acting (I think, correct me if I'm wrong).


> 3. bug tracker timeline for the module that includes ticket changes,
> wiki edits, commits and everything else. Filtered.


That seems like information overload. It might be cool to see all of that,
but I'm not sure what the gain is. Some modules get worked on in spurts and
sometimes modules don't see action for months. It doesn't actually mean
anything, though.


> 4. roadmap page with actual status, plans and coverage
> 5. dashboard page with links to all the above
>

If you know how to do this, you are more than welcome to whip up some code
and show how it would help.

> `python development tools`:
> 1. no way to get all related code for the module
>  1.1. source code location (repository, branches)
>  1.2. source code components (source file, tests, documentation)
>

What exactly do you mean? Since you have submitted several issues, some with
patches, I have a hard time believing that you've done all of that work
without knowing where any of that information was.


> 2. no code coverage (test/user story/rfc/pep)
>

If you know of a way to incorporate code coverage tools and metrics into the
current process, I believe a number of people would be interested. There
currently exists some coverage tool that runs on the current repository, but
I'm not sure of its location or status.


> 4. no code review management process
>

I agree, this is an area that could use work. It has been suggested that
Rietveld be incorporated into Roundup both visually ("upload to Rietveld"
button) and as a part of the workflow (possible requirement before commit).
As with many of these comments, lack of time and a lack of available
volunteers are two of many answers as to why there is no traction on this.


> 5. no way to notify interested parties
>

I'm not sure what this is specifically addressing.

> anatoly t.

From lutz at rmi.net  Fri Jun 18 17:09:40 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Fri, 18 Jun 2010 15:09:40 -0000
Subject: [Python-Dev] email package status in 3.X
Message-ID: <cvsjrr4t84x35d3418062010110947@SMTP>

Replying en masse to save bandwidth here...

Barry Warsaw <barry at python.org> writes:
> We know it, we have extensively discussed how to fix it, we have IMO a good
> design, and we even have someone willing and able to tackle the problem.  We
> need to find a sufficient source of funding to enable him to do the work it
> will take, and so far that's been the biggest stumbling block.  It will take a
> focused and determined effort to see this through, and it's obvious that
> volunteers cannot make it happen.  I include myself in the latter category, as
> I've tried and failed at least twice to do it in my spare time.

All understood, and again, not to disparage anyone here.  My 
comments are directed to the development community at large
to underscore the grave p/r problems 3.X faces.

I realize email parsing is a known issue; I also realize that
most people evaluating 3.X today won't care that it is.  Most
will care only that the new version of a language reportedly 
used by Google and YouTube still doesn't support CGI uploads 
a year and a half after its release.  As an author, that's a 
downright horrible story to have to tell the world.


"Stephen J. Turnbull" <stephen at xemacs.org> writes:
> Email, of course, is a big wart.  But guess what?  Python 2's email
> module doesn't actually work! 

Yes it does (see next point).

> If you live in Kansas, sure, you can concentrate on dodging tornados
> and completely forget about Unicode and MIME and text/bogus content.
> For the rest of the world, though, the problem is not Python 3

Yes it is, and Kansas is a lot bigger than you seem to think.

I want to reiterate that I was able to build a feature rich
email client with the email package as it exists in 3.1.  This
includes support on both the receiving and sending sides for HTML,
arbitrary attachments, and decoding and encoding of both text 
payloads and headers according to email, MIME, and Unicode/I18N
standards.  It's an amazingly useful package, and does work as is
in 3.X.  The two main issues I found have been recently fixed.  
It's unfortunate that this package is also the culprit behind CGI
breakage, but it's not clear why it became a critical path for so
much utility in the first place.

The package might not be aesthetically ideal, but to me it 
seems that an utterly incompatible overhaul of this in the name
of supporting potentially very different data streams is a huge
functional overload.  And to those people in Kansas who live 
outside the pydev clique, replacing it with something different 
at this point will look as if an incompatible Python is already 
incompatible with releases in its own line.  Why in the world 
would anyone base a new project on that sort of thrashing?

For my part, I've had to add far too many notes to the upcoming
edition of Programming Python about major pieces of functionality
that worked in 2.X but no longer do in 3.X.  That's disappointing
to me personally, but it will probably seem a lot worse to the
book's tens of thousands of readers.  Yet this is the reality 
that 3.X has created for itself.

> Should Python 3 have been held back until email was fixed?  Dunno, but
> I personally am very glad it was not; where I have a choice, I always
> use Python 3 now, and have yet to run into a problem. 

I guess we'll just have to disagree on that.  IMHO, Python 3 shot
itself in the foot by releasing in half-baked form.  And the 3.0 
I/O speed issue (remember that?) came very close to blowing its 
leg clean off.

The reality out there in Kansas today is that 3.X is perceived as 
so bad that it could very well go the way of POP4 if its story does
not improve.  I don't know what sort of Python world will be left
behind in the wake, but I do know it will probably be much smaller.


Steve Holden <steve at holdenweb.com> writes:
> Lest the readership think that the PSF is unaware of this issue, allow
> me to point out that we have already partially funded this effort, and
> are still offering R. David Murray some further matching funds if he can
> raise sponsorship to complete the effort (on which he has made a very
> promising start).
> 
> We are also attempting to enable tax-deductible fund raising to increase
> the likelihood of David's finding support. Perhaps we need to think
> about a broader campaign to increase the quality of the python 3
> libraries. I find it very annoying that the #python IRC group still has
> "Don't use Python 3" in its topic.  They adamantly refuse to remove it
> until there is better library support, and they are the guys who see the
> issues day in day out so it is hard to argue with them (and I don't
> think an autocratic decision-making process would be appropriate).

I'm all for people getting paid for work they do, but with all
due respect, I think this underscores part of the problem in 
the Python world today.  If funding had been as stringent a 
prerequisite in the 90s, I doubt there would be a Python today.
It was about the fun and the code, not the bucks and the 
bureaucracy.  As far as I can recall, there was no notion of 
creating a task force to get things done.

Of course, this may just be the natural evolutionary pattern of 
human enterprises.  As it is today, though, the Python community 
has a formal diversity statement, but it still does not have a 
fully functional 3.X almost two years after the fact.  I doubt
that I'm the only one who sees the irony in that.

Again, I mean no disrespect to people contributing to Python 
today on so many fronts, and I have no answers to offer here. 
For better or worse, though, this is a personal issue to me too.
After spending much of the last 2 years updating the best-selling 
Python books for all the changes this group has seen fit to make, 
I believe I can say with some authority that 3.X still faces a
very uncertain future.

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


From fuzzyman at voidspace.org.uk  Fri Jun 18 17:31:09 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 18 Jun 2010 16:31:09 +0100
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <cvsjrr4t84x35d3418062010110947@SMTP>
References: <cvsjrr4t84x35d3418062010110947@SMTP>
Message-ID: <4C1B913D.60401@voidspace.org.uk>

On 18/06/2010 16:09, lutz at rmi.net wrote:
> Replying en masse to save bandwidth here...
>
> Barry Warsaw <barry at python.org> writes:
>    
>> We know it, we have extensively discussed how to fix it, we have IMO a good
>> design, and we even have someone willing and able to tackle the problem.  We
>> need to find a sufficient source of funding to enable him to do the work it
>> will take, and so far that's been the biggest stumbling block.  It will take a
>> focused and determined effort to see this through, and it's obvious that
>> volunteers cannot make it happen.  I include myself in the latter category, as
>> I've tried and failed at least twice to do it in my spare time.
>>      
> All understood, and again, not to disparage anyone here.  My
> comments are directed to the development community at large
> to underscore the grave p/r problems 3.X faces.
>
> I realize email parsing is a known issue; I also realize that
> most people evaluating 3.X today won't care that it is.  Most
> will care only that the new version of a language reportedly
> used by Google and YouTube still doesn't support CGI uploads
> a year and a half after its release.  As an author, that's a
> downright horrible story to have to tell the world.
>
>    

Really? How widely used is the CGI module these days? Maybe there is a 
reason nobody appeared to notice...


> [snip...]
>> Should Python 3 have been held back until email was fixed?  Dunno, but
>> I personally am very glad it was not; where I have a choice, I always
>> use Python 3 now, and have yet to run into a problem.
>>      
> I guess we'll just have to disagree on that.  IMHO, Python 3 shot
> itself in the foot by releasing in half-baked form.  And the 3.0
> I/O speed issue (remember that?) came very close to blowing its
> leg clean off.
>
>    

Whilst I agree that there are plenty of issues to work on, and I don't 
underestimate the difficulty of some of them, I think "half-baked" is 
very much overblown. Whilst you have a lot to say about how much of a 
problem this is, I don't understand what you are suggesting be *done*?

Python 3.0 was *declared* to be an experimental release, and by most 
standards 3.1 (in terms of the core language and functionality) was a 
solid release.

Any reasonable expectation about Python 3 adoption predicted that it 
would take years, and would include going through a phase of difficulty 
and disappointment...

All the best,

Michael Foord

-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From status at bugs.python.org  Fri Jun 18 18:09:47 2010
From: status at bugs.python.org (Python tracker)
Date: Fri, 18 Jun 2010 18:09:47 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20100618160947.8D29D7816D@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2010-06-11 - 2010-06-18)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2777 open (+43) / 18070 closed (+12) / 20847 total (+55)

Open issues with patches:  1122

Average duration of open issues: 713 days.
Median duration of open issues: 503 days.

Open Issues Breakdown
       open  2747 (+43)
languishing    13 ( +0)
    pending    16 ( +0)

Issues Created Or Reopened (64)
_______________________________

New SSL module doesn't seem to verify hostname against commonN 2010-06-15
       http://bugs.python.org/issue1589    reopened pitrou                               
                                                                               

struct allows repeat spec. without a format specifier          2010-06-12
CLOSED http://bugs.python.org/issue3129    reopened belopolsky                           
       patch                                                                   

datetime lacks concrete tzinfo implementation for UTC          2010-06-15
CLOSED http://bugs.python.org/issue5094    reopened belopolsky                           
       patch                                                                   

datetime.strptime doesn't support %z format ?                  2010-06-18
       http://bugs.python.org/issue6641    reopened belopolsky                           
       patch                                                                   

libffi update to 3.0.9                                         2010-06-12
       http://bugs.python.org/issue8142    reopened haypo                                
       patch, buildbot                                                         

struct - please make sizes explicit                            2010-06-15
CLOSED http://bugs.python.org/issue8469    reopened mark.dickinson                       
       patch                                                                   

test_distutils fails if srcdir != builddir                     2010-06-15
       http://bugs.python.org/issue8577    reopened pitrou                               
       patch                                                                   

Expose sqlite3 connection inTransaction as read-only in_transa 2010-06-12
       http://bugs.python.org/issue8845    reopened haypo                                
       patch, easy                                                             

Tkinter Litmus Test                                            2010-06-11
CLOSED http://bugs.python.org/issue8971    reopened merwok                               
                                                                               

svnmerge errors in msgfmt.py                                   2010-06-11
       http://bugs.python.org/issue8974    created  merwok                               
       patch                                                                   

Bug in cookiejar                                               2010-06-11
       http://bugs.python.org/issue8975    created  Popa.Claudiu                         
                                                                               

subprocess module causes segmentation fault                    2010-06-11
       http://bugs.python.org/issue8976    created  Chris.Blazick                        
                                                                               

Globalize lonely augmented assignment                          2010-06-11
CLOSED http://bugs.python.org/issue8977    created  serprex                              
       patch                                                                   

"tarfile.ReadError: file could not be opened successfully" if  2010-06-11
       http://bugs.python.org/issue8978    created  flox                                 
                                                                               

OptParse __getitem__                                           2010-06-12
CLOSED http://bugs.python.org/issue8979    created  bcward                               
                                                                               

distutils.tests.test_register.RegisterTestCase.test_strict fai 2010-06-12
       http://bugs.python.org/issue8980    created  Arfrever                             
       patch                                                                   

_struct.__version__ should be string, not bytes                2010-06-12
CLOSED http://bugs.python.org/issue8981    created  belopolsky                           
       easy                                                                    

argparse docs cross reference Namespace as a class but the Nam 2010-06-12
       http://bugs.python.org/issue8982    created  r.david.murray                       
                                                                               

Docstrings should refer to help(name), not name.__doc__        2010-06-12
       http://bugs.python.org/issue8983    created  belopolsky                           
       patch                                                                   

Python 3 doesn't register script arguments                     2010-06-12
CLOSED http://bugs.python.org/issue8984    created  Sworddragon                          
                                                                               

String format() has problems parsing numeric indexes           2010-06-12
CLOSED http://bugs.python.org/issue8985    created  gosella                              
                                                                               

math.erfc OverflowError                                        2010-06-12
CLOSED http://bugs.python.org/issue8986    created  debatem1                             
       patch                                                                   

Distutils doesn't quote Windows command lines properly         2010-06-13
       http://bugs.python.org/issue8987    created  mgiuca                               
       patch                                                                   

import + coding = failure (3.1.2/win32)                        2010-06-13
       http://bugs.python.org/issue8988    created  gonegown                             
                                                                               

email.utils.make_msgid: specify domain                         2010-06-13
       http://bugs.python.org/issue8989    created  avbidder at fortytwo.ch                 
       patch                                                                   

array constructor and array.fromstring should accept bytearray 2010-06-13
       http://bugs.python.org/issue8990    created  tjollans                             
       patch                                                                   

PyArg_Parse*() functions: reject discontinious buffers         2010-06-13
       http://bugs.python.org/issue8991    created  haypo                                
       patch                                                                   

convertsimple() doesn't need to call converterr() if an except 2010-06-13
       http://bugs.python.org/issue8992    created  haypo                                
       patch                                                                   

Small typo in docs for PySys_SetArgv                           2010-06-14
CLOSED http://bugs.python.org/issue8993    created  flashk                               
       patch                                                                   

pydoc does not support non-ascii docstrings                    2010-06-14
       http://bugs.python.org/issue8994    created  torsten                              
                                                                               

Performance issue with multiprocessing queue (3.1 VS 2.6)      2010-06-14
       http://bugs.python.org/issue8995    created  bob                                  
                                                                               

Add a default role to allow writing bare `len` instead of :fun 2010-06-14
       http://bugs.python.org/issue8996    created  merwok                               
                                                                               

Write documentation for codecs.readbuffer_encode()             2010-06-14
       http://bugs.python.org/issue8997    created  lemburg                              
                                                                               

add crypto routines to stdlib                                  2010-06-14
       http://bugs.python.org/issue8998    created  debatem1                             
                                                                               

Add Mercurial support to patchcheck                            2010-06-15
       http://bugs.python.org/issue8999    created  merwok                               
       patch                                                                   

Provide parseable repr to datetime.timezone                    2010-06-15
       http://bugs.python.org/issue9000    created  belopolsky                           
       easy                                                                    

PyFile_FromFd wrong documentation                              2010-06-15
CLOSED http://bugs.python.org/issue9001    created  trovao                               
       patch                                                                   

Add a pointer on where to find a better description of PyFile_ 2010-06-15
CLOSED http://bugs.python.org/issue9002    created  trovao                               
       patch                                                                   

urllib about https behavior                                    2010-06-16
       http://bugs.python.org/issue9003    created  debatem1                             
                                                                               

datetime.utctimetuple() should not set tm_isdst flag to 0      2010-06-16
       http://bugs.python.org/issue9004    created  belopolsky                           
                                                                               

Year range in timetuple                                        2010-06-16
       http://bugs.python.org/issue9005    created  belopolsky                           
                                                                               

xml-rpc Server object does not propagate the encoding to Unmar 2010-06-16
       http://bugs.python.org/issue9006    created  Timothée.CEZARD                      
                                                                               

CGIHTTPServer supports only Python CGI scripts                 2010-06-16
       http://bugs.python.org/issue9007    created  techtonik                            
                                                                               

CGIHTTPServer support for arbitrary CGI scripts                2010-06-16
       http://bugs.python.org/issue9008    created  techtonik                            
                                                                               

Improve quality of Python/dtoa.c                               2010-06-16
       http://bugs.python.org/issue9009    created  mark.dickinson                       
       patch                                                                   

Infinite loop in imaplib.IMAP4_SSL when used with Gmail        2010-06-16
CLOSED http://bugs.python.org/issue9010    created  Ruben.Bakker                         
                                                                               

ast_for_factor unary minus optimization changes AST            2010-06-16
       http://bugs.python.org/issue9011    created  alexhsamuel                          
       patch                                                                   

Separate compilation of time and datetime modules              2010-06-16
CLOSED http://bugs.python.org/issue9012    reopened haypo                                
       patch                                                                   

Implement tzinfo.dst() method in timezone                      2010-06-16
       http://bugs.python.org/issue9013    created  belopolsky                           
                                                                               

Incorrect documentation of the PyObject_HEAD macro             2010-06-16
       http://bugs.python.org/issue9014    created  trovao                               
                                                                               

array.array.tofile cannot write arrays of sizes > 4GB, even co 2010-06-16
       http://bugs.python.org/issue9015    created  Bill.Steinmetz                       
                                                                               

IDLE won't launch (Win XP)                                     2010-06-17
       http://bugs.python.org/issue9016    created  jonseger                             
                                                                               

doctest option flag to enable/disable some chunk of doctests?  2010-06-17
       http://bugs.python.org/issue9017    created  harobed                              
                                                                               

os.path.normcase(None) does not raise an error on linux and sh 2010-06-17
       http://bugs.python.org/issue9018    created  r.david.murray                       
       easy                                                                    

wsgiref.headers.Header() does not update headers list it was c 2010-06-17
       http://bugs.python.org/issue9019    created  Marcel.Hellkamp                      
                                                                               

2.7: eval hangs on AIX                                         2010-06-17
       http://bugs.python.org/issue9020    created  srid                                 
                                                                               

no copy.copy problem description                               2010-06-17
       http://bugs.python.org/issue9021    created  techtonik                            
                                                                               

TypeError in wsgiref.handlers when using CGIHandler            2010-06-18
       http://bugs.python.org/issue9022    created  toxicdav3                            
                                                                               

distutils relative path errors                                 2010-06-18
       http://bugs.python.org/issue9023    created  ghazel                               
                                                                               

PyDateTime_IMPORT macro incorrectly marked up                  2010-06-18
       http://bugs.python.org/issue9024    created  tim.golden                           
       patch                                                                   

Non-uniformity in randrange for large arguments.               2010-06-18
       http://bugs.python.org/issue9025    created  mark.dickinson                       
       patch                                                                   

[argparse] Subcommands not printed in the same order they were 2010-06-18
       http://bugs.python.org/issue9026    created  jcollado                             
       patch                                                                   

add test_support.run_unittest command line options and argumen 2010-06-18
       http://bugs.python.org/issue9027    created  techtonik                            
                                                                               

test_support.run_unittest cmdline options and arguments        2010-06-18
CLOSED http://bugs.python.org/issue9028    created  techtonik                            
                                                                               

Issues Now Closed (53)
______________________

New style vs. old style classes __ror__()  operator overloadin  854 days
       http://bugs.python.org/issue2102    tjreedy                              
                                                                               

Backport 3.0 struct module changes to 2.6                       818 days
       http://bugs.python.org/issue2397    mark.dickinson                       
                                                                               

confusing action of struct.pack and struct.unpack with fmt 'p'  748 days
       http://bugs.python.org/issue2981    mark.dickinson                       
                                                                               

struct allows repeat spec. without a format specifier             3 days
       http://bugs.python.org/issue3129    belopolsky                           
       patch                                                                   

Python doesn't handle SIGINT well if it arrives during interpr  726 days
       http://bugs.python.org/issue3137    haypo                                
       patch                                                                   

[patch] allow mmap take file offset as argument                 651 days
       http://bugs.python.org/issue3765    tjreedy                              
                                                                               

warning: unknown conversion type character `z' in format        574 days
       http://bugs.python.org/issue4370    mark.dickinson                       
       patch                                                                   

Incorrect docstring of os.setpgrp                               566 days
       http://bugs.python.org/issue4452    orsenthil                            
                                                                               

datetime lacks concrete tzinfo implementation for UTC             0 days
       http://bugs.python.org/issue5094    belopolsky                           
       patch                                                                   

os.makedirs' mode argument has bad default value                488 days
       http://bugs.python.org/issue5220    smyrman                              
                                                                               

msgfmt.py does not work with plural form                        459 days
       http://bugs.python.org/issue5464    merwok                               
                                                                               

email feedparser.py CRLFLF bug: $ vs \Z                         443 days
       http://bugs.python.org/issue5610    r.david.murray                       
       patch                                                                   

traceback presented in wrong encoding                           331 days
       http://bugs.python.org/issue6543    haypo                                
       patch, needs review                                                     

Document 2.x -> 3.x round changes in "What's New" documents.    225 days
       http://bugs.python.org/issue7261    mark.dickinson                       
       patch                                                                   

PyDateTime_IMPORT() causes compiler warnings                    185 days
       http://bugs.python.org/issue7463    belopolsky                           
                                                                               

logger.StreamHandler emit encoding fallback is wrong            184 days
       http://bugs.python.org/issue7470    merwok                               
       patch                                                                   

IGNORE_EXCEPTION_DETAIL should ignore the module name           181 days
       http://bugs.python.org/issue7490    ncoghlan                             
       patch                                                                   

IDLE about dialog credits raises UnicodeDecodeError              87 days
       http://bugs.python.org/issue8203    haypo                                
       patch                                                                   

getargs.c in Python3 contains some TODO and the documentation    82 days
       http://bugs.python.org/issue8215    haypo                                
       patch                                                                   

Add Misc/maintainers.rst to 2.x branch                           68 days
       http://bugs.python.org/issue8362    techtonik                            
       patch, needs review                                                     

Broken zipfile with  python 3.2 on osx                           58 days
       http://bugs.python.org/issue8442    ronaldoussoren                       
                                                                               

struct - please make sizes explicit                               2 days
       http://bugs.python.org/issue8469    mark.dickinson                       
       patch                                                                   

'y' does not check for embedded NUL bytes                        43 days
       http://bugs.python.org/issue8592    haypo                                
       patch                                                                   

undo findsource regression/change                                33 days
       http://bugs.python.org/issue8720    r.david.murray                       
       patch                                                                   

tarfile/Windows: Don't use mbcs as the default encoding          21 days
       http://bugs.python.org/issue8784    haypo                                
       patch                                                                   

Remove codecs.readbuffer_encode() and codecs.charbuffer_encode   18 days
       http://bugs.python.org/issue8838    haypo                                
                                                                               

Add module level now() and today() functions to datetime modul   10 days
       http://bugs.python.org/issue8903    techtonik                            
                                                                               

quick example how to fix docs                                     7 days
       http://bugs.python.org/issue8904    georg.brandl                         
                                                                               

Error in error message in logging                                 5 days
       http://bugs.python.org/issue8924    merwok                               
                                                                               

Improve c-api/arg.rst: use "bytes" or "str" types instead of "    7 days
       http://bugs.python.org/issue8925    merwok                               
       patch                                                                   

SimpleHTTPServer should contain usage example                    10 days
       http://bugs.python.org/issue8937    techtonik                            
                                                                               

utf-32be codec failing on UCS-2 python build for 32-bit value     3 days
       http://bugs.python.org/issue8941    pitrou                               
       patch                                                                   

2.7rc1 tarfile.py: `bltn_open(targetpath, "wb")` -> IOError: I    5 days
       http://bugs.python.org/issue8958    srid                                 
                                                                               

2.6 README                                                        2 days
       http://bugs.python.org/issue8960    georg.brandl                         
                                                                               

test_imp fails on OSX when LANG is set                            1 days
       http://bugs.python.org/issue8965    haypo                                
       patch                                                                   

Windows: use (mbcs in) strict mode to encode/decode filenames,    3 days
       http://bugs.python.org/issue8969    haypo                                
       patch                                                                   

Tkinter Litmus Test                                               0 days
       http://bugs.python.org/issue8971    merwok                               
                                                                               

Inconsistent docstrings in struct module                          1 days
       http://bugs.python.org/issue8973    belopolsky                           
       patch                                                                   

Globalize lonely augmented assignment                             1 days
       http://bugs.python.org/issue8977    mark.dickinson                       
       patch                                                                   

OptParse __getitem__                                              1 days
       http://bugs.python.org/issue8979    merwok                               
                                                                               

_struct.__version__ should be string, not bytes                   0 days
       http://bugs.python.org/issue8981    mark.dickinson                       
       easy                                                                    

Python 3 doesn't register script arguments                        0 days
       http://bugs.python.org/issue8984    mark.dickinson                       
                                                                               

String format() has problems parsing numeric indexes              1 days
       http://bugs.python.org/issue8985    eric.smith                           
                                                                               

math.erfc OverflowError                                           0 days
       http://bugs.python.org/issue8986    mark.dickinson                       
       patch                                                                   

Small typo in docs for PySys_SetArgv                              0 days
       http://bugs.python.org/issue8993    georg.brandl                         
       patch                                                                   

PyFile_FromFd wrong documentation                                 0 days
       http://bugs.python.org/issue9001    pitrou                               
       patch                                                                   

Add a pointer on where to find a better description of PyFile_    0 days
       http://bugs.python.org/issue9002    pitrou                               
       patch                                                                   

Infinite loop in imaplib.IMAP4_SSL when used with Gmail           1 days
       http://bugs.python.org/issue9010    r.david.murray                       
                                                                               

Separate compilation of time and datetime modules                 0 days
       http://bugs.python.org/issue9012    haypo                                
       patch                                                                   

test_support.run_unittest cmdline options and arguments           0 days
       http://bugs.python.org/issue9028    r.david.murray                       
                                                                               

Speed up function calls/can add more introspection info        1970 days
       http://bugs.python.org/issue1107887 collinwinter                         
       patch                                                                   

prompt_user_passwd() in FancyURLopener masks 401 Unauthorized  1663 days
       http://bugs.python.org/issue1368368 orsenthil                            
       patch                                                                   

readline problem on ia64-unknown-linux-gnu                     1316 days
       http://bugs.python.org/issue1593035 tjreedy                              
                                                                               

Top Issues Most Discussed (10)
______________________________

 30 add crypto routines to stdlib                                      4 days
open        http://bugs.python.org/issue8998   

 24 datetime lacks concrete tzinfo implementation for UTC              0 days
closed      http://bugs.python.org/issue5094   

 22 Add pure Python implementation of datetime module to CPython     116 days
open        http://bugs.python.org/issue7989   

 17 `make patchcheck` should check the whitespace of .c/.h files      13 days
open        http://bugs.python.org/issue8912   

 12 Python 3 doesn't register script arguments                         0 days
closed      http://bugs.python.org/issue8984   

 12 Inconsistent docstrings in struct module                           1 days
closed      http://bugs.python.org/issue8973   

 12 sys.argv contains only scriptname                                123 days
open        http://bugs.python.org/issue7936   

 10 Improve quality of Python/dtoa.c                                   2 days
open        http://bugs.python.org/issue9009   

 10 CGIHTTPServer support for arbitrary CGI scripts                    2 days
open        http://bugs.python.org/issue9008   

  9 subprocess.list2cmdline doesn't quote the & character              7 days
pending     http://bugs.python.org/issue8972   


From walter at livinglogic.de  Fri Jun 18 18:32:00 2010
From: walter at livinglogic.de (Walter Dörwald)
Date: Fri, 18 Jun 2010 18:32:00 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
In-Reply-To: <AANLkTimnL0GfmWULGWfdvxVONdn1MgBh8etl0pChO517@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<AANLkTimnL0GfmWULGWfdvxVONdn1MgBh8etl0pChO517@mail.gmail.com>
Message-ID: <4C1B9F80.6080203@livinglogic.de>

On 18.06.10 17:04, Brian Curtin wrote:

> [...]
>     2. no code coverage (test/user story/rfc/pep)
> 
> 
> If you know of a way to incorporate code coverage tools and metrics into
> the current process, I believe a number of people would be interested.
> There currently exists some coverage tool that runs on the current
> repository, but I'm not sure of its location or status.

   http://coverage.livinglogic.de/

I haven't touched the code in a year, but the job's still running.

> [...]

Servus,
   Walter


From lutz at rmi.net  Fri Jun 18 19:22:10 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Fri, 18 Jun 2010 17:22:10 -0000
Subject: [Python-Dev] email package status in 3.X
Message-ID: <h3sa87mevl05p5ro18062010012216@SMTP>

> Python 3.0 was *declared* to be an experimental release, and by most 
> standards 3.1 (in terms of the core language and functionality) was a 
> solid release.
> 
> Any reasonable expectation about Python 3 adoption predicted that it 
> would take years, and would include going through a phase of difficulty 
> and disappointment...

Declaring something to be a turd doesn't change the fact that
it's a turd.  I have a feeling that most people outside this
list would have much rather avoided the difficulty and 
disappointment altogether.

Let's be honest here; 3.X was released to the community in part 
as an extended beta.  That's not a problem, unless you drop the 
word "beta".  And if you're still not buying that, imagine the sort
of response you'd get if you tried to sell software that billed 
itself as "experimental", and promised a phase of "disappointment".  
Why would you expect the Python world to react any differently?

> Whilst I agree that there are plenty of issues to work on, and I don't 
> underestimate the difficulty of some of them, I think "half-baked" is 
> very much overblown. Whilst you have a lot to say about how much of a 
> problem this is I don't understand what you are suggesting be *done*?

I agree that 3.X isn't all bad, and I very much hope it succeeds.  And 
no, I have no answers; I'm just reporting the perception from downwind.

So here it is: The prevailing view is that 3.X developers foisted things
on users that they did not fully work through themselves.  Unicode is 
prime among these: for all the talk here about how 2.X was broken in 
this regard, the implications of the 3.X string solution remain to be
fully resolved in the 3.X standard library to this day.  What is a 
common Python user to make of that?

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)


From fuzzyman at voidspace.org.uk  Fri Jun 18 19:27:46 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 18 Jun 2010 18:27:46 +0100
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <h3sa87mevl05p5ro18062010012216@SMTP>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
Message-ID: <4C1BAC92.70500@voidspace.org.uk>

On 18/06/2010 18:22, lutz at rmi.net wrote:
>> Python 3.0 was *declared* to be an experimental release, and by most
>> standards 3.1 (in terms of the core language and functionality) was a
>> solid release.
>>
>> Any reasonable expectation about Python 3 adoption predicted that it
>> would take years, and would include going through a phase of difficulty
>> and disappointment...
>>      
> Declaring something to be a turd doesn't change the fact that
> it's a turd.

Right - but *you're* the one calling it a turd, which is not a helpful 
approach or likely to achieve *anything* useful. I still have no idea 
what you are actually suggesting.

> I have a feeling that most people outside this
> list would have much rather avoided the difficulty and
> disappointment altogether.
>
> Let's be honest here; 3.X was released to the community in part
> as an extended beta.

Correction - 3.0 was an experimental release. That is not true of 3.1 
and future releases.

All the best,

Michael


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From janssen at parc.com  Fri Jun 18 19:46:22 2010
From: janssen at parc.com (Bill Janssen)
Date: Fri, 18 Jun 2010 10:46:22 PDT
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTilyEP0DXlbEutiov6AXVcVhZP1Wk6fgwrYn3Mwq@mail.gmail.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>
	<58318.1276798282@parc.com>
	<AANLkTilyEP0DXlbEutiov6AXVcVhZP1Wk6fgwrYn3Mwq@mail.gmail.com>
Message-ID: <60565.1276883182@parc.com>

Giampaolo Rodolà <g.rodola at gmail.com> wrote:

> 2010/6/17 Bill Janssen <janssen at parc.com>:
> 
> > There's a related meta-issue having to do with antique protocols.
> 
> Can I know what meta-issue are you talking about exactly?

Giampaolo, I believe that you and I have already discussed this on one
of the FTP issues.

Bill


From g.rodola at gmail.com  Fri Jun 18 20:23:17 2010
From: g.rodola at gmail.com (Giampaolo Rodolà)
Date: Fri, 18 Jun 2010 20:23:17 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <60565.1276883182@parc.com>
References: <6wwifklfk7n7tup216062010044853@SMTP>
	<AANLkTim4RjIE-BMNyZVZUFNPsslFvGR4zKv46mRDidXb@mail.gmail.com>
	<58318.1276798282@parc.com>
	<AANLkTilyEP0DXlbEutiov6AXVcVhZP1Wk6fgwrYn3Mwq@mail.gmail.com>
	<60565.1276883182@parc.com>
Message-ID: <AANLkTimBUUvKSBeq-uNPIMDB4pGh65BYa2IExoxryz-n@mail.gmail.com>

2010/6/18 Bill Janssen <janssen at parc.com>:
> Giampaolo Rodolà <g.rodola at gmail.com> wrote:
>
>> 2010/6/17 Bill Janssen <janssen at parc.com>:
>>
>> > There's a related meta-issue having to do with antique protocols.
>>
>> Can I know what meta-issue are you talking about exactly?
>
> Giampaolo, I believe that you and I have already discussed this on one
> of the FTP issues.
>
> Bill

I only remember a discussion in which I was against removing OOB data
support from asyncore in order to support certain parts of the FTP
protocol using it, but that's all.
I don't see how urllib or any other stdlib module is supposed to be
penalized by the FTP protocol in any way.

--- Giampaolo
http://code.google.com/p/pyftpdlib
http://code.google.com/p/psutil

From lutz at rmi.net  Fri Jun 18 20:52:45 2010
From: lutz at rmi.net (lutz at rmi.net)
Date: Fri, 18 Jun 2010 18:52:45 -0000
Subject: [Python-Dev] email package status in 3.X
Message-ID: <medczp3dq3tj4doi18062010025250@SMTP>

I wasn't calling Python 3 a turd.  I was trying to show
the strangeness of the logic behind your rationalization.
And failing badly... (maybe I should have used "tar ball"?)

What I'm suggesting is that extreme caution be exercised from
this point forward with all things 3.X-related.  Whether you 
wish to accept this or not, 3.X has a negative image to many.
This suggestion specifically includes not abandoning current 
3.X email package users as a case in point.  Ripping the rug 
out from new 3.X users after they took the time to port seems
like it may be just enough to tip the scales altogether.

--Mark Lutz  (http://learning-python.com, http://rmi.net/~lutz)



From pje at telecommunity.com  Fri Jun 18 22:48:21 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 18 Jun 2010 16:48:21 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <h3sa87mevl05p5ro18062010012216@SMTP>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
Message-ID: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>

At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>So here it is: The prevailing view is that 3.X developers foisted things
>on users that they did not fully work through themselves.  Unicode is
>prime among these: for all the talk here about how 2.X was broken in
>this regard, the implications of the 3.X string solution remain to be
>fully resolved in the 3.X standard library to this day.  What is a
>common Python user to make of that?

Certainly, this was my impression as well, after all the Web-SIG 
discussions regarding the state of the stdlib in 3.x with respect to 
URL parsing, joining, opening, etc.

To be honest, I'm waiting to see some sort of tutorial(s) for using 
3.x that actually addresses these kinds of stdlib usage issues, so 
that I don't have to think about it or futz around with 
experimenting, possibly to find that some things can't be done at all.

IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be 
obvious ways to do it, but, as per the Zen of Python, "that way may 
not be obvious at first unless you're Dutch".  ;-)

Since at the moment Python 3 offers me only cosmetic improvements 
over 2.x (apart from argument annotations), it's hard to get excited 
enough about it to want to muck about with porting anything to it, or 
even trying to learn about all the ramifications of the changes.  :-(


From tjreedy at udel.edu  Fri Jun 18 22:53:42 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 18 Jun 2010 16:53:42 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <4C1B9F80.6080203@livinglogic.de>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>	<AANLkTimnL0GfmWULGWfdvxVONdn1MgBh8etl0pChO517@mail.gmail.com>
	<4C1B9F80.6080203@livinglogic.de>
Message-ID: <hvgmcn$7kn$1@dough.gmane.org>

On 6/18/2010 12:32 PM, Walter Dörwald wrote:

>     http://coverage.livinglogic.de/

I am a bit puzzled as to the meaning of the gray/red/green bars since 
the correlation between coverage % and bars is not very high.


From jnoller at gmail.com  Fri Jun 18 23:02:09 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 18 Jun 2010 17:02:09 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
Message-ID: <AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>

On Fri, Jun 18, 2010 at 4:48 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>>
>> So here it is: The prevailing view is that 3.X developers foisted things
>> on users that they did not fully work through themselves.  Unicode is
>> prime among these: for all the talk here about how 2.X was broken in
>> this regard, the implications of the 3.X string solution remain to be
>> fully resolved in the 3.X standard library to this day.  What is a
>> common Python user to make of that?
>
> Certainly, this was my impression as well, after all the Web-SIG discussions
> regarding the state of the stdlib in 3.x with respect to URL parsing,
> joining, opening, etc.

Nothing is set in stone; if something is incredibly painful, or worse
yet broken, then someone needs to file a bug, bring it to this list,
or bring up a patch. This is code we're talking about - nothing is set
in stone, and if something is criminally broken it needs to be first
identified, and then fixed.

> To be honest, I'm waiting to see some sort of tutorial(s) for using 3.x that
> actually addresses these kinds of stdlib usage issues, so that I don't have
> to think about it or futz around with experimenting, possibly to find that
> some things can't be done at all.

I guess tutorial welcome, rather than patch welcome then ;)

> IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be obvious
> ways to do it, but, as per the Zen of Python, "that way may not be obvious
> at first unless you're Dutch".  ;-)

What areas? We need specifics, which can either be:

1> Shot down.
2> Turned into bugs, so they can be fixed
3> Documented in the core documentation.

jesse

From brett at python.org  Fri Jun 18 23:09:11 2010
From: brett at python.org (Brett Cannon)
Date: Fri, 18 Jun 2010 14:09:11 -0700
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <hvgmcn$7kn$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com> 
	<AANLkTimnL0GfmWULGWfdvxVONdn1MgBh8etl0pChO517@mail.gmail.com> 
	<4C1B9F80.6080203@livinglogic.de> <hvgmcn$7kn$1@dough.gmane.org>
Message-ID: <AANLkTin9MTbfRX1p34p5uV7hNKbK9Ban6uCfiSRIRH8N@mail.gmail.com>

On Fri, Jun 18, 2010 at 13:53, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/18/2010 12:32 PM, Walter Dörwald wrote:
>
>>    http://coverage.livinglogic.de/
>
> I am a bit puzzled as to the meaning of the gray/red/green bars since the
> correlation between coverage % and bars is not very high.

Gray is lines that are unexecutable (comments, etc.), green are lines
that were executed, and red is lines not executed.

From fuzzyman at voidspace.org.uk  Sat Jun 19 00:08:32 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 18 Jun 2010 23:08:32 +0100
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <medczp3dq3tj4doi18062010025250@SMTP>
References: <medczp3dq3tj4doi18062010025250@SMTP>
Message-ID: <4C1BEE60.4040508@voidspace.org.uk>

On 18/06/2010 19:52, lutz at rmi.net wrote:
> I wasn't calling Python 3 a turd.  I was trying to show
> the strangeness of the logic behind your rationalization.
> And failing badly... (maybe I should have used "tar ball"?)
>
>    

I didn't make myself clear. The expected disappointment I was referring 
to was about the rate of adoption, not about the quality of the product.

I'm still baffled as to how a bug in the cgi module (along with the 
acknowledged email problems) is such a big deal. Was it reported and 
then languished in the bug tracker? That would be bad on its own but if 
it was only recently discovered that indicates that it probably isn't 
such a big deal - either way it needs fixing, but using Python for 
writing cgis hasn't been a big use case for a long time.

All the best,

Michael



-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog



From tjreedy at udel.edu  Sat Jun 19 00:08:19 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 18 Jun 2010 18:08:19 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTik2tQbDxXFWsKGLbWdTY8Y8qHr5ofjR0Z5crYA4@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>	<AANLkTikD3zOmYZ1zHDJo6m7Ifu2txeTDA_FfOX1AkgGH@mail.gmail.com>	<201006190009.46122.steve@pearwood.info>
	<AANLkTik2tQbDxXFWsKGLbWdTY8Y8qHr5ofjR0Z5crYA4@mail.gmail.com>
Message-ID: <hvgqol$tuf$1@dough.gmane.org>

On 6/18/2010 10:24 AM, Jesse Noller wrote:

> http://jessenoller.com/2010/05/20/announcing-python-sprint-sponsorship/

This does not specify what expenses you are thinking of covering. Food 
is the most obvious.

Anyway, this got me to think about offering my house as a site for US 
east coast mid-atlantic sprints (near I95, halfway between NY and WDC, 
FIOS internet, TV/Playstation/Netflix for breaks ;-).

Terry Jan Reedy


From nyamatongwe at gmail.com  Sat Jun 19 00:31:40 2010
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Sat, 19 Jun 2010 08:31:40 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C1B913D.60401@voidspace.org.uk>
References: <cvsjrr4t84x35d3418062010110947@SMTP>
	<4C1B913D.60401@voidspace.org.uk>
Message-ID: <AANLkTimhJqiLdapFKKOD9OT9YBWC0BYNnS2D_si8ruDV@mail.gmail.com>

Michael Foord:

> Python 3.0 was *declared* to be an experimental release, and by most
> standards 3.1 (in terms of the core language and functionality) was a solid
> release.

   That looks to me like an after-the-event rationalization. The
release note for Python 3.0 (and the "What's new") gives no indication
that it is experimental but does say """
We are confident that Python 3.0 is of the same high quality as our
previous releases ...
you can safely choose either version (or both) to use in your projects. """
http://mail.python.org/pipermail/python-dev/2008-December/083824.html

   Neil

From jnoller at gmail.com  Sat Jun 19 00:37:56 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Fri, 18 Jun 2010 18:37:56 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <hvgqol$tuf$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<AANLkTikD3zOmYZ1zHDJo6m7Ifu2txeTDA_FfOX1AkgGH@mail.gmail.com>
	<201006190009.46122.steve@pearwood.info>
	<AANLkTik2tQbDxXFWsKGLbWdTY8Y8qHr5ofjR0Z5crYA4@mail.gmail.com>
	<hvgqol$tuf$1@dough.gmane.org>
Message-ID: <AANLkTin4phKiuy-Ux0c3CiW2Ck3r2gycdUwYfRDY9B8n@mail.gmail.com>

On Fri, Jun 18, 2010 at 6:08 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/18/2010 10:24 AM, Jesse Noller wrote:
>
>> http://jessenoller.com/2010/05/20/announcing-python-sprint-sponsorship/
>
> This does not specify what expenses you are thinking of covering. Food is
> the most obvious.
>
> Anyway, this got me to think about offering my house as a site for US east
> coast mid-atlantic sprints (near I95, halfway between NY and WDC, FIOS
> internet, TV/Playstation/Netflix for breaks ;-).
>
> Terry Jan Reedy

Yup, I'm putting the site together now - essentially what's covered is
"anything up to this amount" - meaning, if you spend 200$ on room
space, then this could go to that. Or 200$ in food for 20 people, etc.
We'll have basic guidelines.

jesse

From raymond.hettinger at gmail.com  Sat Jun 19 00:51:10 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Fri, 18 Jun 2010 15:51:10 -0700
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C1BEE60.4040508@voidspace.org.uk>
References: <medczp3dq3tj4doi18062010025250@SMTP>
	<4C1BEE60.4040508@voidspace.org.uk>
Message-ID: <606BEE82-71C7-4033-A273-F9E945ACDF90@gmail.com>


On Jun 18, 2010, at 3:08 PM, Michael Foord wrote:

> I'm still baffled as to how a bug in the cgi module (along with the acknowledged email problems) is such a big deal. Was it reported and then languished in the bug tracker? That would be bad on its own but if it was only recently discovered that indicates that it probably isn't such a big deal - either way it needs fixing, but using Python for writing cgis hasn't been a big use case for a long time.

That's one possible explanation.  Another possible explanation is the product isn't being heavily exercised for serious work and that it has yet to be shaken-out thoroughly.   There has been a disappointing lack of bug reports across the board for 3.x.  That doesn't mean that the bugs aren't there and that they won't be reported when adoption is heavier.

In the cases of email, mime handling, cgi and whatnot, the important point is not whether a given technology is popular.  The important part is that it hints at the kind of bytes/text issues that people are going to face and that we will need to help them address (i.e. such as blobs containing multiple encodings, a need to use byte oriented tools such as md5 in conjunction with text oriented applications, etc.)
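
As a rough illustration of the bytes/text point above (a sketch only, not taken from the thread; the sample string and variable names are made up for the example): in 3.x a byte-oriented tool such as hashlib.md5 refuses str input, so the caller has to pick an encoding, and the choice changes the digest.

```python
import hashlib

# A str holding non-ASCII text ("naive cafe" with accents); hypothetical sample data.
text = "na\u00efve caf\u00e9"

# hashlib operates on bytes, so handing it a str raises TypeError in 3.x.
error = None
try:
    hashlib.md5(text)
except TypeError as exc:
    error = exc

# The encoding must be chosen explicitly, and different encodings
# produce different digests for the same text.
digest_utf8 = hashlib.md5(text.encode("utf-8")).hexdigest()
digest_latin1 = hashlib.md5(text.encode("latin-1")).hexdigest()
```

In 2.x the same call would have accepted a byte string silently, which is why this kind of issue tends to surface only when code is ported.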

One other thought:  In addition to not getting many 3.x specific bug reports, we don't seem to be getting many 3.x specific help questions (e.g. asking about dictviews or how to make a priority queue in an environment where many callables don't support ordering operations, etc.).


> Mark Lutz wrote

> What I'm suggesting is that extreme caution be exercised from
> this point forward with all things 3.X-related.  Whether you
> wish to accept this or not, 3.X has a negative image to many.
> This suggestion specifically includes not abandoning current
> 3.X email package users as a case in point.  Ripping the rug
> out from new 3.X users after they took the time to port seems
> like it may be just enough to tip the scales altogether.

A couple other areas that need work (some of them are minor):

* BeautifulSoup was left behind when SGML parsing was removed from the standard lib.
* Shelves were crippled for Windows users when bsddb was ripped out.
* Lists containing None for missing values are no longer sortable.
* The basic heapq approach to making a priority queue no longer works well.
   Simply decorating with (priority_level, callable_or_object) fails with two tasks at the
   same priority if the callable or other objects aren't orderable.
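
The last two items can be illustrated with a short sketch, assuming Python 3. The names PriorityQueue and sort_with_none are hypothetical; this is one common workaround (a sequence number as tie-breaker, and a sort key that keeps None apart from real values), not something proposed in this thread.

```python
import heapq
import itertools


class PriorityQueue:
    """Priority queue whose payloads are never compared to each other."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()  # increasing tie-breaker

    def push(self, priority, task):
        # Entries are (priority, sequence number, task).  The sequence
        # number is unique, so tuple comparison is always decided before
        # it would reach the (possibly unorderable) task.
        heapq.heappush(self._heap, (priority, next(self._counter), task))

    def pop(self):
        priority, _, task = heapq.heappop(self._heap)
        return priority, task

    def __len__(self):
        return len(self._heap)


def sort_with_none(values):
    """Sort values, placing None entries first instead of raising TypeError."""
    # None maps to (False, None) and x to (True, x).  False sorts before
    # True, so None is never ordered against a real value.
    return sorted(values, key=lambda v: (v is not None, v))


queue = PriorityQueue()
queue.push(1, len)    # two callables at the same priority
queue.push(1, print)
queue.push(0, abs)
print(queue.pop()[1] is abs, sort_with_none([3, None, 1, None]))
```

Tasks at equal priority come back in insertion order, which is usually what a scheduler wants anyway.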


Raymond

P.S.  I do think it would be great if we could direct some attention
to parts of 3.x that are really nice.  Am hoping that this conversation
doesn't drown in negativity.   Instead, it should focus on what 
improvements are needed to win broader adoption.


From fuzzyman at voidspace.org.uk  Sat Jun 19 00:56:40 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 18 Jun 2010 23:56:40 +0100
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <606BEE82-71C7-4033-A273-F9E945ACDF90@gmail.com>
References: <medczp3dq3tj4doi18062010025250@SMTP>
	<4C1BEE60.4040508@voidspace.org.uk>
	<606BEE82-71C7-4033-A273-F9E945ACDF90@gmail.com>
Message-ID: <4C1BF9A8.7030208@voidspace.org.uk>

On 18/06/2010 23:51, Raymond Hettinger wrote:
> On Jun 18, 2010, at 3:08 PM, Michael Foord wrote:
>
>    
>> I'm still baffled as to how a bug in the cgi module (along with the acknowledged email problems) is such a big deal. Was it reported and then languished in the bug tracker? That would be bad on its own but if it was only recently discovered that indicates that it probably isn't such a big deal - either way it needs fixing, but using Python for writing cgis hasn't been a big use case for a long time.
>>      
> That's one possible explanation.  Another possible explanation is the product isn't being heavily exercised for serious work and that it has yet to be shaken-out thoroughly.   There has been a disappointing lack of bug reports across the board for 3.x.  That doesn't mean that the bugs aren't there and that they won't be reported when adoption is heavier.
>
>    

Oh, I quite agree. I don't think it makes py3k a turd either.

> In the cases of email, mime handling, cgi and whatnot, the important point is not whether a given technology is popular.  The important part is that it hints at the kind of bytes/text issues that people are going to face and that we will need to help them address (e.g. blobs containing multiple encodings, a need to use byte-oriented tools such as md5 in conjunction with text-oriented applications, etc.)
>
> One other thought:  In addition to not getting many 3.x specific bug reports, we don't seem to be getting many 3.x specific help questions (e.g. asking about dictviews or how to make a priority queue in an environment where many callables don't support ordering operations, etc.).
>
>    

Most of the questions I've seen about Python 3 are from library authors 
doing porting rather than application developers. This is to be expected 
I guess.


>    
>> Mark Lutz wrote
>>      
>    
>> What I'm suggesting is that extreme caution be exercised from
>> this point forward with all things 3.X-related.  Whether you
>> wish to accept this or not, 3.X has a negative image to many.
>> This suggestion specifically includes not abandoning current
>> 3.X email package users as a case in point.  Ripping the rug
>> out from new 3.X users after they took the time to port seems
>> like it may be just enough to tip the scales altogether.
>>      
> A couple other areas that need work (some of them are minor):
>
> * BeautifulSoup was left behind when SGML parsing was removed from the standard lib.
> * Shelves were crippled for Windows users when bsddb was ripped out.
> * Lists containing None for missing values are no longer sortable.
>    

Yeah, this one can be a bugger. :-)

> * The basic heapq approach to making a priority queue no longer works well.
>     Simply decorating with (priority_level, callable_or_object) fails with two tasks at the
>     same priority if the callable or other objects aren't orderable.
>
>
> Raymond
>
> P.S.  I do think it would be great if we could direct some attention
> to parts of 3.x that are really nice.  Am hoping that this conversation
> doesn't drown in negativity.   Instead, it should focus on what
> improvements are needed to win broader adoption.
>
>
>    

I definitely agree that our focus should be on fixing problems as we 
find them and working on increasing adoption. No argument from me.

All the best,

Michael


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From tjreedy at udel.edu  Sat Jun 19 04:39:36 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 18 Jun 2010 22:39:36 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <606BEE82-71C7-4033-A273-F9E945ACDF90@gmail.com>
References: <medczp3dq3tj4doi18062010025250@SMTP>	<4C1BEE60.4040508@voidspace.org.uk>
	<606BEE82-71C7-4033-A273-F9E945ACDF90@gmail.com>
Message-ID: <hvhal8$390$2@dough.gmane.org>

On 6/18/2010 6:51 PM, Raymond Hettinger wrote:
> There has been a disappointing
> lack of bug reports across the board for 3.x.

Here is one from this week involving the interaction of array and 
bytearray. It needs a comment from someone who can understand the C-API 
based patch, which is beyond me.
http://bugs.python.org/issue8990

Another possible reason for the lack: 500 of the current 2800 open 
issues have NO comment (ie, message count = 1), some with patches.
I just posted '500 tracker orphans; we need more reviewers' on 
python-list to encourage more participation.

Terry Jan Reedy


From walter at livinglogic.de  Sat Jun 19 11:57:35 2010
From: walter at livinglogic.de (=?utf-8?Q?Walter_D=C3=B6rwald?=)
Date: Sat, 19 Jun 2010 11:57:35 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <hvgmcn$7kn$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>	<AANLkTimnL0GfmWULGWfdvxVONdn1MgBh8etl0pChO517@mail.gmail.com>
	<4C1B9F80.6080203@livinglogic.de> <hvgmcn$7kn$1@dough.gmane.org>
Message-ID: <C7103F9E-420C-422E-9B75-C2C7BC88F83A@livinglogic.de>

Am 18.06.2010 um 22:53 schrieb Terry Reedy <tjreedy at udel.edu>:

> On 6/18/2010 12:32 PM, Walter Dörwald wrote:
>
>>    http://coverage.livinglogic.de/
>
> I am a bit puzzled as to the meaning of the gray/red/green bars  
> since the correlation between coverage % and bars is not very high.

The gray bar is the uncoverable part of the source (empty lines,
comments etc.), the green bar is the covered part (i.e. those lines
that really got executed) and the red bar is the uncovered part (i.e.
those lines that could have been executed but weren't). So coverage is

    green / (green + red)

Just click on the coverage header to sort by coverage and you *will*  
see a correlation.
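
As a worked example of the formula above, here is a small sketch. The function name coverage_ratio and the line counts are made up for illustration, and returning 0.0 for a file with no coverable lines is an assumption, not necessarily what the site does.

```python
def coverage_ratio(covered, uncovered):
    """Coverage as green / (green + red); gray lines are left out."""
    coverable = covered + uncovered
    if coverable == 0:
        return 0.0  # assumption: nothing coverable is reported as 0
    return covered / coverable


# Hypothetical file with 40 gray, 45 green and 15 red lines.
gray, green, red = 40, 45, 15
coverage = coverage_ratio(green, red)        # 45 / 60
green_share = green / (gray + green + red)   # 45 / 100, width of the green bar
print(coverage, green_share)
```

The green bar spans 45% of the file while coverage is 75%, which may be why the bar widths alone look weakly correlated with the percentage.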

Servus,
    Walter


From arcriley at gmail.com  Sat Jun 19 12:59:44 2010
From: arcriley at gmail.com (Arc Riley)
Date: Sat, 19 Jun 2010 06:59:44 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <20100618050712.GC20639@thorne.id.au>
References: <20100618050712.GC20639@thorne.id.au>
Message-ID: <AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>

You mean Twisted support, because library support is at the point where
there are fewer actively maintained packages not yet ported than those which
are.  Of course if your Python experience is hyper-focused to one framework
that isn't ported yet, it will certainly seem like a lot, and you guys who
run #Python are clearly hyper-focused on Twisted.

Great example of the current state: about an hour ago I needed an inotify
Python package for a Py3 project.  I googled for "Python inotify", found
pyinotify, saw that they have several recent releases but no mention of Py3,
typed "sudo emerge -av pyinotify", and it installed pyinotify for Python
2.6, 3.1, and 3.2_pre at the same time.  Run python interactively, imports
and works great.

Portage (Gentoo's package system, emerge being the primary command) is
Python based and fully ported to Python 3.  Most of my workstations and
production servers report "/usr/bin/python --version" as "Python 3.1.2"
(Python 2.6 is /usr/bin/python2), my Apache's mod_wsgi is compiled for
Python 3 and save for a few Django and Trac sites (fastcgi) all of my
Python-based webapps run on it. CherryPy and SQLAlchemy have had Py3 support
for some time.

I can name in a short list the legacy Python packages I use:

   - Django
   - Trac
   - Mercurial (they have a Summer of Code student working to port it now)
   - PIL (apparently will have a Python 3 release out soon)
   - pygtk (Python 3 support planned for Gnome 3 in a few months)
   - xmpppy

The list of Python 3 packages I use regularly is at least 50 names long and
I have only contributed to porting a dozen or so of those.

This anti-Py3 rhetoric is damaging to the community and needs to stop.
We're moving forward toward Python 3.2 and beyond, complaining about it only
saps valuable developer time (including your own) from getting these
libraries you need ported faster.


On Fri, Jun 18, 2010 at 1:07 AM, Stephen Thorne <stephen at thorne.id.au> wrote:

>
> Yes, #python keeps the text "It's too early to use Python 3.x" in its
> topic.
> Library support is the only reason.
>
>

From breamoreboy at yahoo.co.uk  Sat Jun 19 13:20:01 2010
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Sat, 19 Jun 2010 12:20:01 +0100
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
Message-ID: <hvi93i$c4j$1@dough.gmane.org>

On 19/06/2010 11:59, Arc Riley wrote:
> You mean Twisted support, because library support is at the point where
> there are fewer actively maintained packages not yet ported than those which
> are.  Of course if your Python experience is hyper-focused to one framework
> that isn't ported yet, it will certainly seem like a lot, and you guys who
> run #Python are clearly hyper-focused on Twisted.
>
> Great example of the current state: about an hour ago I needed an inotify
> Python package for a Py3 project.  I googled for "Python inotify", found
> pyinotify, saw that they have several recent releases but no mention of Py3,
> typed "sudo emerge -av pyinotify", and it installed pyinotify for Python
> 2.6, 3.1, and 3.2_pre at the same time.  Run python interactively, imports
> and works great.
>
> Portage (Gentoo's package system, emerge being the primary command) is
> Python based and fully ported to Python 3.  Most of my workstations and
> production servers report "/usr/bin/python --version" as "Python 3.1.2"
> (Python 2.6 is /usr/bin/python2), my Apache's mod_wsgi is compiled for
> Python 3 and save for a few Django and Trac sites (fastcgi) all of my
> Python-based webapps run on it. CherryPy and SQLAlchemy have had Py3 support
> for some time.
>
> I can name in a short list the legacy Python packages I use:
>
>     - Django
>     - Trac
>     - Mercurial (they have a Summer of Code student working to port it now)
>     - PIL (apparently will have a Python 3 release out soon)
>     - pygtk (Python 3 support planned for Gnome 3 in a few months)
>     - xmpppy
>
> The list of Python 3 packages I use regularly is at least 50 names long and
> I have only contributed to porting a dozen or so of those.
>
> This anti-Py3 rhetoric is damaging to the community and needs to stop.
> We're moving forward toward Python 3.2 and beyond, complaining about it only
> saps valuable developer time (including your own) from getting these
> libraries you need ported faster.
>

Fair comment, but how many people are waiting for numpy for Python 3? 
I'd guess that it's many, many thousands, given that there are people 
such as myself who use it indirectly, in my case via matplotlib.  Note 
that I am aware that the numpy Python 3 support is very close to release.

Kindest regards.

Mark Lawrence.


From stephen at xemacs.org  Sat Jun 19 13:34:41 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 19 Jun 2010 20:34:41 +0900
Subject: [Python-Dev] Python Library Support in 3.x (Was: email
	package	status in 3.X)
In-Reply-To: <AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
Message-ID: <87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>

anatoly techtonik writes:

 > I do not know what you are intending to do, but my opinion is that
 > fund raising for patching the library is a waste of money.

Of course it's not a waste of money.  The need is real, so as long as
the PSF and other organizations (GSoC) choose reasonable projects/
people to support, progress will be steady.  Merely the sense that
real resources are flowing into the stdlib from outside the volunteer
core will encourage more volunteers as well.

 > PSF should concentrate on enhancing tools to make lives of library
 > supporters easier. I do not want to become a maintainer,

Well, the current maintainers, while not yet happy with the state of
the infrastructure, have been steadily engaged in improving it by
adding features that have consensus support.  But getting consensus
support is not easy.

E.g., I thought that with three plausible candidates, of which Mercurial
was obviously satisfactory (although I preferred git, myself, and at
least a couple of people advocated Bazaar strongly), a switch to a dVCS
was a no-brainer.  It wasn't.  Several people opposed it strongly until it
became clear that in theory at least it would require *no* changes to
current workflow (although I think most of those developers will find
much to like about the changes Mercurial will bring).  And even now
implementation is hanging up on the requirement that it not affect
Windows-based developers adversely ... and it turns out that even
being Python-based is nowhere near enough to guarantee that, but
rather it requires further effort before that will become reality --
and it's not forthcoming from the Mercurial developers, who
unsurprisingly like Mercurial enough to deal with the minor flaws.

IMO, if you want to improve the infrastructure, you need to work on
getting consensus behind a few of your proposals, rather than making
one after another and not following up with code or a PEP.

From solipsis at pitrou.net  Sat Jun 19 13:51:04 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 19 Jun 2010 13:51:04 +0200
Subject: [Python-Dev] Mercurial
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100619135104.5b0f22ed@pitrou.net>

On Sat, 19 Jun 2010 20:34:41 +0900
"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> 
> And even now
> implementation is hanging up on the requirement that it not affect
> Windows-based developers adversely ... and it turns out that even
> being Python-based is nowhere near enough to guarantee that, but
> rather it requires further effort before that will become reality --
> and it's not forthcoming from the Mercurial developers, who
> unsurprisingly like Mercurial enough to deal with the minor flaws.

FWIW, the EOL extension is now part of Mercurial:
http://mercurial.selenic.com/wiki/EolExtension


Antoine.


From exarkun at twistedmatrix.com  Sat Jun 19 14:12:56 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Sat, 19 Jun 2010 12:12:56 -0000
Subject: [Python-Dev] Python Library Support in 3.x (Was: email
	package	status in 3.X)
In-Reply-To: <AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
Message-ID: <20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>

On 10:59 am, arcriley at gmail.com wrote:
>You mean Twisted support, because library support is at the point where
>there are fewer actively maintained packages not yet ported than those 
>which
>are.  Of course if your Python experience is hyper-focused to one 
>framework
>that isn't ported yet, it will certainly seem like a lot, and you guys 
>who
>run #Python are clearly hyper-focused on Twisted.

Arc,

This isn't about Twisted.  Let's not waste everyone's time by trying to 
make it into a conflict between Twisted users and the rest of the Python 
community.

You listed six other major packages that you yourself use that aren't 
available on Python 3 yet, so why are you trying to say here that this 
is all about Twisted?
>[snip]
>
>This anti-Py3 rhetoric is damaging to the community and needs to stop.
>We're moving forward toward Python 3.2 and beyond, complaining about it 
>only
>saps valuable developer time (including your own) from getting these
>libraries you need ported faster.

No, it's not damaging.  Critical self-evaluation is a useful tool. 
Trying to silence differing perspectives is what's damaging to the 
community.

Jean-Paul

From orsenthil at gmail.com  Sat Jun 19 14:13:02 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Sat, 19 Jun 2010 17:43:02 +0530
Subject: [Python-Dev] Mercurial
In-Reply-To: <20100619135104.5b0f22ed@pitrou.net>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100619135104.5b0f22ed@pitrou.net>
Message-ID: <20100619121302.GB12233@remy>

On Sat, Jun 19, 2010 at 01:51:04PM +0200, Antoine Pitrou wrote:
> FWIW, the EOL extension is now part of Mercurial:
> http://mercurial.selenic.com/wiki/EolExtension

Should we all move soon now? 
Any target date you have in mind, Antoine?

-- 
Senthil

From barry at python.org  Sat Jun 19 14:33:17 2010
From: barry at python.org (Barry Warsaw)
Date: Sat, 19 Jun 2010 08:33:17 -0400
Subject: [Python-Dev] Mercurial
In-Reply-To: <20100619121302.GB12233@remy>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100619135104.5b0f22ed@pitrou.net> <20100619121302.GB12233@remy>
Message-ID: <20100619083317.11355342@heresy>

On Jun 19, 2010, at 05:43 PM, Senthil Kumaran wrote:

>On Sat, Jun 19, 2010 at 01:51:04PM +0200, Antoine Pitrou wrote:
>> FWIW, the EOL extension is now part of Mercurial:
>> http://mercurial.selenic.com/wiki/EolExtension
>
>Should we all move soon now? 
>Any target date you have in mind, Antoine?

I believe the plan was to migrate right after 2.7 final is released.  I hope
that is still the plan.  Since that is only 2 weeks away, are we ready?

-Barry

From solipsis at pitrou.net  Sat Jun 19 14:42:18 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 19 Jun 2010 14:42:18 +0200
Subject: [Python-Dev] Mercurial
In-Reply-To: <20100619121302.GB12233@remy>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100619135104.5b0f22ed@pitrou.net> <20100619121302.GB12233@remy>
Message-ID: <20100619144218.4209e881@pitrou.net>

On Sat, 19 Jun 2010 17:43:02 +0530
Senthil Kumaran <orsenthil at gmail.com> wrote:
> On Sat, Jun 19, 2010 at 01:51:04PM +0200, Antoine Pitrou wrote:
> > FWIW, the EOL extension is now part of Mercurial:
> > http://mercurial.selenic.com/wiki/EolExtension
> 
> Should we all move soon now? 
> Any target date you have in mind, Antoine?

I should point out that I am in no way responsible for the migration.
I think Dirkjan and Brett said they would tackle this after the 2.7
release. But they'd better answer by themselves :)

From prologic at shortcircuit.net.au  Sat Jun 19 15:05:37 2010
From: prologic at shortcircuit.net.au (James Mills)
Date: Sat, 19 Jun 2010 23:05:37 +1000
Subject: [Python-Dev] Mercurial
In-Reply-To: <20100619144218.4209e881@pitrou.net>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com> 
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100619135104.5b0f22ed@pitrou.net> 
	<20100619121302.GB12233@remy> <20100619144218.4209e881@pitrou.net>
Message-ID: <AANLkTin1isuBk2QaHdZ1xxEhMPnH6Vs7OxPb8_U9aCCQ@mail.gmail.com>

On Sat, Jun 19, 2010 at 10:42 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> I should point out that I am in no way responsible for the migration.
> I think Dirkjan and Brett said they would tackle this after the 2.7
> release. But they'd better answer by themselves :)

I'm willing to help out if needed. Can't hurt to have
another set of hands :) I'm sure there are others in the
Mercurial/Python community that would be willing to help too!

cheers
james

From martin at v.loewis.de  Sat Jun 19 15:07:33 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 19 Jun 2010 15:07:33 +0200
Subject: [Python-Dev] Mercurial
In-Reply-To: <20100619083317.11355342@heresy>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100619135104.5b0f22ed@pitrou.net>
	<20100619121302.GB12233@remy> <20100619083317.11355342@heresy>
Message-ID: <4C1CC115.60709@v.loewis.de>

Am 19.06.2010 14:33, schrieb Barry Warsaw:
> On Jun 19, 2010, at 05:43 PM, Senthil Kumaran wrote:
>
>> On Sat, Jun 19, 2010 at 01:51:04PM +0200, Antoine Pitrou wrote:
>>> FWIW, the EOL extension is now part of Mercurial:
>>> http://mercurial.selenic.com/wiki/EolExtension
>>
>> Should we all move soon now?
>> Any target date you have in mind, Antoine?
>
> I believe the plan was to migrate right after 2.7 final is released.

I don't think so. The last update to the plan that I know of was in

http://mail.python.org/pipermail/python-dev/2010-February/097497.html

and it said that we would migrate on May 1. This hasn't happened,
but there was no update to the plan since (that I know of).

> I hope
> that is still the plan.  Since that is only 2 weeks away, are we ready?

Not nearly. AFAICT, the conversion process isn't complete yet, and the 
hook scripts are missing. Also, I would really like to see a /final/
demo installation *before* the switchover; because these things are all
missing, the final demo installation is missing, as well.

Regards,
Martin

From arcriley at gmail.com  Sat Jun 19 15:09:55 2010
From: arcriley at gmail.com (Arc Riley)
Date: Sat, 19 Jun 2010 09:09:55 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
Message-ID: <AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>

Just because legacy Python needs to be kept around for a bit longer for a
few uses does not mean that "Python 3 is not ready yet".  Any decent package
system can have two or more versions of Python installed at the same time.

It is not "critical self-evaluation" to repeat "Python 3 is not ready" as
litany in #Python and your supporting website.  I use the word "litany" here
because #Python refers users to what appears to be a religious website
http://python-commandments.org/python3.html

I have further witnessed (and even been the other party to) you and other
ops in #Python telling package developers, who have clearly said that they
are working to port their legacy package to Py3, that "Python 3 is not
ready".  One of our Summer of Code students this year actually included in
his application that he was told (strongly) in #Python that he shouldn't be
working with Py3 - even after he expressed his intent to apply under the PSF
to help with the Py3 migration effort as his project.

Besides rallying against it, what have you, as a Twisted developer, done
regarding the Python 3 migration process?


On Sat, Jun 19, 2010 at 8:12 AM, <exarkun at twistedmatrix.com> wrote:

> On 10:59 am, arcriley at gmail.com wrote:
>
>> You mean Twisted support, because library support is at the point where
>> there are fewer actively maintained packages not yet ported than those
>> which
>> are.  Of course if your Python experience is hyper-focused to one
>> framework
>> that isn't ported yet, it will certainly seem like a lot, and you guys who
>> run #Python are clearly hyper-focused on Twisted.
>>
>
> Arc,
>
> This isn't about Twisted.  Let's not waste everyone's time by trying to
> make it into a conflict between Twisted users and the rest of the Python
> community.
>
> You listed six other major packages that you yourself use that aren't
> available on Python 3 yet, so why are you trying to say here that this is
> all about Twisted?
>
>> [snip]
>>
>>
>> This anti-Py3 rhetoric is damaging to the community and needs to stop.
>> We're moving forward toward Python 3.2 and beyond, complaining about it
>> only
>> saps valuable developer time (including your own) from getting these
>> libraries you need ported faster.
>>
>
> No, it's not damaging.  Critical self-evaluation is a useful tool. Trying
> to silence differing perspectives is what's damaging to the community.
>
> Jean-Paul
>

From martin at v.loewis.de  Sat Jun 19 15:11:40 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 19 Jun 2010 15:11:40 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email	package
 status in 3.X)
In-Reply-To: <20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
Message-ID: <4C1CC20C.7060709@v.loewis.de>

>> This anti-Py3 rhetoric is damaging to the community and needs to stop.
>> We're moving forward toward Python 3.2 and beyond, complaining about
>> it only
>> saps valuable developer time (including your own) from getting these
>> libraries you need ported faster.
>
> No, it's not damaging. Critical self-evaluation is a useful tool.

It's useful only if constructive. Stating a problem is, in itself,
just frustrating. One needs to accompany it with proposals of actions.

In the specific case, I'm optimistic, though. 2.7 will be the last
release of 2.x, so it will then be easier to focus on fixing the 3.x
bugs.

Regards,
Martin

From martin at v.loewis.de  Sat Jun 19 15:23:25 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 19 Jun 2010 15:23:25 +0200
Subject: [Python-Dev] Mercurial
In-Reply-To: <AANLkTin1isuBk2QaHdZ1xxEhMPnH6Vs7OxPb8_U9aCCQ@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com>
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100619135104.5b0f22ed@pitrou.net>
	<20100619121302.GB12233@remy> <20100619144218.4209e881@pitrou.net>
	<AANLkTin1isuBk2QaHdZ1xxEhMPnH6Vs7OxPb8_U9aCCQ@mail.gmail.com>
Message-ID: <4C1CC4CD.6080801@v.loewis.de>

Am 19.06.2010 15:05, schrieb James Mills:
> On Sat, Jun 19, 2010 at 10:42 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> I should point out that I am in no way responsible for the migration.
>> I think Dirkjan and Brett said they would tackle this after the 2.7
>> release. But they'd better answer by themselves :)
>
> I'm willing to help out if needed. Can't hurt to have
> another set of hands :) I'm sure there are others in the
> Mercurial/Python community that would be willing to help too!

Take a look at

http://hg.python.org/pymigr/

What I *think* is missing is all the hook scripts (but you would need to 
check with Dirkjan whether they are already somewhere).

In theory, I would expect that you can run this migration suite 
yourself, and get a working installation - but I never tried myself.

See also PEP 385, which is the master plan. I'm not sure whether the
approach to branches has been approved (or who could really approve it);
I just notice that the current conversion produces a ridiculously large
repository (which fails to download with older versions of hg because
of size).

On the meta level, what seems to be missing as well is a clear view on
what the status is - so if you manage to get it working somehow, don't
forget to post what you think the status is.

Regards,
Martin

From g.brandl at gmx.net  Sat Jun 19 15:43:55 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sat, 19 Jun 2010 15:43:55 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
Message-ID: <hvihli$4r7$1@dough.gmane.org>

Am 19.06.2010 15:09, schrieb Arc Riley:
> Just because legacy Python needs to be kept around for a bit longer for
> a few uses does not mean that "Python 3 is not ready yet".  Any decent
> package system can have two or more versions of Python installed at the
> same time.
> 
> It is not "critical self-evaluation" to repeat "Python 3 is not ready"
> as litany in #Python and your supporting website.  I use the word
> "litany" here because #Python refers users to what appears to be a
> religious website http://python-commandments.org/python3.html
> 
> I have further witnessed (and even been the other party to) you and
> other ops in #Python telling package developers, who have clearly said
> that they are working to port their legacy package to Py3, that "Python
> 3 is not ready".  One of our Summer of Code students this year actually
> included in his application that he was told (strongly) in #Python that
> he shouldn't be working with Py3 - even after he expressed his intent to
> apply under the PSF to help with the Py3 migration effort as his project.

Ouch.  Looks like it's time for the PSU to release the 10-ton wei


From tseaver at palladion.com  Sat Jun 19 15:57:47 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Sat, 19 Jun 2010 09:57:47 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C1BEE60.4040508@voidspace.org.uk>
References: <medczp3dq3tj4doi18062010025250@SMTP>
	<4C1BEE60.4040508@voidspace.org.uk>
Message-ID: <hviics$6vf$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Michael Foord wrote:

> I didn't make myself clear. The expected disappointment I was referring 
> to was about the rate of adoption, not about the quality of the product.
> 
> I'm still baffled as to how a bug in the cgi module (along with the 
> acknowledged email problems) is such a big deal. Was it reported and 
> then languished in the bug tracker? That would be bad on its own but if 
> it was only recently discovered that indicates that it probably isn't 
> such a big deal - either way it needs fixing, but using Python for 
> writing cgis hasn't been a big use case for a long time.

FWIW:  some APIs in the cgi module are actually used by a number of
Python2 web frameworks and libraries:  Paste, for instance, uses it, and
is in turn used by BFG, Pylons, and TurboGears.  Zope has used it that way
since forever.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwczNsACgkQ+gerLs4ltQ7IjACfVcUshd10OQfZJqLMmU5p1nZ6
5OcAmwSsn7+q1GO67I1HuOH1waEDI8v/
=1geT
-----END PGP SIGNATURE-----


From stephen at xemacs.org  Sat Jun 19 15:55:29 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 19 Jun 2010 22:55:29 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <h3sa87mevl05p5ro18062010012216@SMTP>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
Message-ID: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>

lutz at rmi.net writes:

 > I agree that 3.X isn't all bad, and I very much hope it succeeds.  And 
 > no, I have no answers; I'm just reporting the perception from downwind.

The fact is, though, that many of your "downwind" readers are not the
audience for Python 3, not yet.  If you want to do Python 3 a favor,
make sure that they understand that Python 3 is *not* an "upgrade" of
Python 2.  It's a hard task for you, but IMO one strategy is to write
in the style that we wrote the DVCS PEP (#374) in: here's how you do
the same task in these similar languages.  And just as git and Bazaar
turned out to have fatal defects in terms of adoption *in that time
frame*, Python 3 is not yet adoptable for many, many users.

Python 3 is a Python-2-like language, but even though it's built on
the same design principles, and uses nearly identical syntax, there
are fundamental differences.  And it is *very* young.  So it's a new
language and should be approached in the same way as any new language.
Try it on non-mission critical projects, on projects where its library
support has a good reputation, etc.  Many of your readers have no time
(or perhaps no approval "from upstairs") for that kind of thing.  Too
bad, but that's what happens to every great new language.

 > So here it is: The prevailing view is that 3.X developers hoisted things
 > on users that they did not fully work through themselves.  Unicode is 
 > prime among these: for all the talk here about how 2.X was broken in 
 > this regard, the implications of the 3.X string solution remain to be
 > fully resolved in the 3.X standard library to this day.  What is a 
 > common Python user to make of that?

Why should she make anything of that?  Python 3 is a *new* language,
possibly as different from Python 2 as C++ was from C (and *more*
different in terms of fundamental incompatibilities).  And as long as
C++ was almost entirely dependent on C libraries, there were problems.
(Not to mention that even today there are plenty of programmers who
are proud to be C programmers, not C++ programmers.)  Today, Python 3
is entirely dependent on Python 2 libraries.  It's human to hope there
will be no problems, but not realistic.

BTW, I think what you're missing is that you're wrong about the money.
Python 3 is still about the fun and the code.  "Fun and code" are why
the core developers spent about five years developing it, because
doing that was fun, because the new code has high value as code, and
because it promised *them* a more fun and more productive future.

Library support, on the other hand, *is* about money.  Your readers,
down in the trenches of WWW, intraweb, and sysadmin implementation and
support, depend on robust libraries to get their day jobs done.  They
really don't care that writing Python 3 was fun, and that programming
in Python 3 is more fun than ever.  That doesn't compensate for even
one lingering str/bytes bogosity to most of them, and since they don't
get paid for fixing Python library bugs, they don't, and they're in no
mood to *forgive* any, either.

So tell users who feel that way to use Python 2, for now, and check on
Python 3 progress every 6 months or so.  And users who are just a bit
more adventurous to stick to applications where the libraries already
have a good reputation *in Python 3*.  It's as simple as that, I think.

Regards,


From tseaver at palladion.com  Sat Jun 19 16:13:34 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Sat, 19 Jun 2010 10:13:34 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
Message-ID: <hvijae$9tc$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jesse Noller wrote:
> On Fri, Jun 18, 2010 at 4:48 PM, P.J. Eby <pje at telecommunity.com> wrote:
>> At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>>> So here it is: The prevailing view is that 3.X developers hoisted things
>>> on users that they did not fully work through themselves.  Unicode is
>>> prime among these: for all the talk here about how 2.X was broken in
>>> this regard, the implications of the 3.X string solution remain to be
>>> fully resolved in the 3.X standard library to this day.  What is a
>>> common Python user to make of that?
>> Certainly, this was my impression as well, after all the Web-SIG discussions
>> regarding the state of the stdlib in 3.x with respect to URL parsing,
>> joining, opening, etc.
> 
> Nothing is set in stone; if something is incredibly painful, or worse
> yet broken, then someone needs to file a bug, bring it to this list,
> or bring up a patch.

Or walk away.

> This is code we're talking about - nothing is set
> in stone, and if something is criminally broken it needs to be first
> identified, and then fixed.
> 
>> To be honest, I'm waiting to see some sort of tutorial(s) for using 3.x that
>> actually addresses these kinds of stdlib usage issues, so that I don't have
>> to think about it or futz around with experimenting, possibly to find that
>> some things can't be done at all.
> 
> I guess tutorial welcome, rather than patch welcome then ;)

The only folks who can write the tutorial are the ones who have already
drunk the koolaid.  Note that I've been making my living with Python for
about twelve years now, and would *like* to use Python3, but can't, yet,
and therefore haven't taken the first sip.

>> IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be obvious
>> ways to do it, but, as per the Zen of Python, "that way may not be obvious
>> at first unless you're Dutch".  ;-)
> 
> What areas. We need specifics which can either be:
> 
> 1> Shot down.
> 2> Turned into bugs, so they can be fixed
> 3> Documented in the core documentation.

That's bloody ironic in a thread which had pointed at reasons why people
are not even considering Py3 for their projects:  those folks won't even
find the issues due to the lack of confidence in the suitability of the
platform.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwc0I0ACgkQ+gerLs4ltQ6aDgCguYv+BXou0a42Yi7ERGCHOfIv
6REAnjejq4LDbE9c/gCqB+xs1yGfQ4KR
=/9fw
-----END PGP SIGNATURE-----


From exarkun at twistedmatrix.com  Sat Jun 19 16:28:00 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Sat, 19 Jun 2010 14:28:00 -0000
Subject: [Python-Dev] Python Library Support in 3.x (Was: email
	package	status in 3.X)
In-Reply-To: <AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
Message-ID: <20100619142800.2412.647073227.divmod.xquotient.206@localhost.localdomain>

On 01:09 pm, arcriley at gmail.com wrote:
>[snip]
>It is not "critical self-evaluation" to repeat "Python 3 is not ready" 
>as
>litany in #Python and your supporting website.  I use the word "litany" 
>here
>because #Python refers users to what appears to be a religious website
>http://python-commandments.org/python3.html

It's not my website.  I don't own the domain, I don't control the 
hosting, I didn't generate the content, I have no access to change 
anything on it.  I've barely even frequented #python in the last three 
years.

Perhaps you were directing those comments at Stephen Thorne though 
(although I don't know if he's any more involved in it than I am so 
don't take this as anything but idle speculation).
>I have further witnessed (and even been the other party to) you and 
>other
>ops in #Python telling package developers, who have clearly said that 
>they
>are working to port their legacy package to Py3, that "Python 3 is not
>ready".

I'm not going to condone or condemn events which I didn't observe.

However you've never witnessed me discouraging developers who were 
actively porting software to Python 3 because I've never done it.  I'm 
sure this was an honest mistake and you simply confused me with someone 
else.
>Besides rally against it what have you, as a Twisted developer, done
>regarding the Python 3 migration process?

This, however, I find extremely insulting.  I don't answer to you.  The 
only reason I'm replying at all is to correct the two pieces of 
misinformation in your message.

I don't see how this discussion can go anywhere productive, so I'll do 
my best to make this my last post on the subject.  Obviously I made a 
mistake posting to the thread at all.

Jean-Paul

From jnoller at gmail.com  Sat Jun 19 16:59:18 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sat, 19 Jun 2010 10:59:18 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvijae$9tc$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>
Message-ID: <609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>


On Jun 19, 2010, at 10:13 AM, Tres Seaver <tseaver at palladion.com> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Jesse Noller wrote:
>> On Fri, Jun 18, 2010 at 4:48 PM, P.J. Eby <pje at telecommunity.com>  
>> wrote:
>>> At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>>>> So here it is: The prevailing view is that 3.X developers hoisted  
>>>> things
>>>> on users that they did not fully work through themselves.   
>>>> Unicode is
>>>> prime among these: for all the talk here about how 2.X was broken  
>>>> in
>>>> this regard, the implications of the 3.X string solution remain  
>>>> to be
>>>> fully resolved in the 3.X standard library to this day.  What is a
>>>> common Python user to make of that?
>>> Certainly, this was my impression as well, after all the Web-SIG  
>>> discussions
>>> regarding the state of the stdlib in 3.x with respect to URL  
>>> parsing,
>>> joining, opening, etc.
>>
>> Nothing is set in stone; if something is incredibly painful, or worse
>> yet broken, then someone needs to file a bug, bring it to this list,
>> or bring up a patch.
>
> Or walk away.
>

Ok. If you want.

>> This is code we're talking about - nothing is set
>> in stone, and if something is criminally broken it needs to be first
>> identified, and then fixed.
>>
>>> To be honest, I'm waiting to see some sort of tutorial(s) for  
>>> using 3.x that
>>> actually addresses these kinds of stdlib usage issues, so that I  
>>> don't have
>>> to think about it or futz around with experimenting, possibly to  
>>> find that
>>> some things can't be done at all.
>>
>> I guess tutorial welcome, rather than patch welcome then ;)
>
> The only folks who can write the tutorial are the ones who have  
> already
> drunk the koolaid.  Note that I've been making my living with Python  
> for
> about twelve years now, and would *like* to use Python3, but can't,  
> yet,
> and therefore haven't taken the first sip.

Why can't you? Is it a bug? Let's file it and fix it. Is it that you  
need a dependency ported? Cool - let's bring it up to the maintainers,  
or this list, or ask the PSF to push resources into helping port.  
Anything but nothing.

If what you're saying is that python 3 is a completely unsuitable  
platform, well, then yeah - we can all "fix" it or walk away.

>
>>> IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be  
>>> obvious
>>> ways to do it, but, as per the Zen of Python, "that way may not be  
>>> obvious
>>> at first unless you're Dutch".  ;-)
>>
>> What areas. We need specifics which can either be:
>>
>> 1> Shot down.
>> 2> Turned into bugs, so they can be fixed
>> 3> Documented in the core documentation.
>
> That's bloody ironic in a thread which had pointed at reasons why  
> people
> are not even considering Py3 for their projects:  those folks won't  
> even
> find the issues due to the lack of confidence in the suitability of  
> the
> platform.

What I saw was a thread about some issues in email, and cgi. We have  
some work being done to address the issue. This will help resolve some  
of the issues.

I'd there are other issues, then we should step up and either help, or  
get out of the way. Arguing about the viability of a platform we knew  
would take a bit for adoption is silly and breeds ill will.

It's not a turd, and it's not hopeless, in fact rumor has it NumPy  
will be ported soon which is a major stepping stone.

  The only way to counteract this meme that python 3 is horribly  
broken is to prove that it's not, fix bugs, and move on. There's no  
point debating relative turdiness here.

Jesse

From jnoller at gmail.com  Sat Jun 19 17:07:06 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sat, 19 Jun 2010 11:07:06 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>
	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
Message-ID: <AANLkTinpF5NVdBDQh6XY6zLgoBMitaysJuGEwcJ0_xGv@mail.gmail.com>

On Sat, Jun 19, 2010 at 10:59 AM, Jesse Noller <jnoller at gmail.com> wrote:
>
>
> On Jun 19, 2010, at 10:13 AM, Tres Seaver <tseaver at palladion.com> wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Jesse Noller wrote:
>>>
>>> On Fri, Jun 18, 2010 at 4:48 PM, P.J. Eby <pje at telecommunity.com> wrote:
>>>>
>>>> At 05:22 PM 6/18/2010 +0000, lutz at rmi.net wrote:
>>>>>
>>>>> So here it is: The prevailing view is that 3.X developers hoisted
>>>>> things
>>>>> on users that they did not fully work through themselves.  Unicode is
>>>>> prime among these: for all the talk here about how 2.X was broken in
>>>>> this regard, the implications of the 3.X string solution remain to be
>>>>> fully resolved in the 3.X standard library to this day.  What is a
>>>>> common Python user to make of that?
>>>>
>>>> Certainly, this was my impression as well, after all the Web-SIG
>>>> discussions
>>>> regarding the state of the stdlib in 3.x with respect to URL parsing,
>>>> joining, opening, etc.
>>>
>>> Nothing is set in stone; if something is incredibly painful, or worse
>>> yet broken, then someone needs to file a bug, bring it to this list,
>>> or bring up a patch.
>>
>> Or walk away.
>>
>
> Ok. If you want.
>
>>> This is code we're talking about - nothing is set
>>> in stone, and if something is criminally broken it needs to be first
>>> identified, and then fixed.
>>>
>>>> To be honest, I'm waiting to see some sort of tutorial(s) for using 3.x
>>>> that
>>>> actually addresses these kinds of stdlib usage issues, so that I don't
>>>> have
>>>> to think about it or futz around with experimenting, possibly to find
>>>> that
>>>> some things can't be done at all.
>>>
>>> I guess tutorial welcome, rather than patch welcome then ;)
>>
>> The only folks who can write the tutorial are the ones who have already
>> drunk the koolaid.  Note that I've been making my living with Python for
>> about twelve years now, and would *like* to use Python3, but can't, yet,
>> and therefore haven't taken the first sip.
>
> Why can't you? Is it a bug? Let's file it and fix it. Is it that you need a
> dependency ported? Cool - let's bring it up to the maintainers, or this
> list, or ask the PSF to push resources into helping port. Anything but
> nothing.
>
> If what you're saying is that python 3 is a completely unsuitable platform,
> well, then yeah - we can all "fix" it or walk away.
>
>>
>>>> IOW, 3.x has broken TOOOWTDI for me in some areas.  There may be obvious
>>>> ways to do it, but, as per the Zen of Python, "that way may not be
>>>> obvious
>>>> at first unless you're Dutch".  ;-)
>>>
>>> What areas. We need specifics which can either be:
>>>
>>> 1> Shot down.
>>> 2> Turned into bugs, so they can be fixed
>>> 3> Documented in the core documentation.
>>
>> That's bloody ironic in a thread which had pointed at reasons why people
>> are not even considering Py3 for their projects:  those folks won't even
>> find the issues due to the lack of confidence in the suitability of the
>> platform.
>
> What I saw was a thread about some issues in email, and cgi. We have some
> work being done to address the issue. This will help resolve some of the
> issues.
>
> I'd there are other issues, then we should step up and either help, or get
> out of the way. Arguing about the viability of a platform we knew would take
> a bit for adoption is silly and breeds ill will.
>

s/I'd/If - stupid phone.

From arcriley at gmail.com  Sat Jun 19 17:14:51 2010
From: arcriley at gmail.com (Arc Riley)
Date: Sat, 19 Jun 2010 11:14:51 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <20100619142800.2412.647073227.divmod.xquotient.206@localhost.localdomain>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com> 
	<20100619142800.2412.647073227.divmod.xquotient.206@localhost.localdomain>
Message-ID: <AANLkTillms0Tv61NVT7tFLsxmuleMTF4s8SF57hqKIL2@mail.gmail.com>

python-commandments.org is owned and hosted by the same person (Allen Short
aka dash aka washort) as pound-python.org which is the "official" website
for #Python and which links to it.

#Python is co-managed by Stephen Thorne (aka Jerub) and Allen Short (aka
dash aka washort).  According to Freenode services, the channel operators
include more than half the active Twisted Matrix developers, including
yourself.  Each of you has had the ability to change the topic at any time.

I may have cast an overly broad net in including you, I don't have IRC logs
to review.  I do remember that you have contributed a great deal of time to
helping people in #Python and that you were fairly active as a channel
operator in #Python when the anti-Py3 rhetoric got started.  Perhaps you can
shine some light on who is actually responsible for promoting this?

I'm sorry if we're in uncomfortable finger-pointing mode, but in the spirit
of critical self-evaluation I think it's time we take a long look at who is
actually representing the Python community in operating our primary
community help channel and whether that situation should continue.


On Sat, Jun 19, 2010 at 10:28 AM, <exarkun at twistedmatrix.com> wrote:

> On 01:09 pm, arcriley at gmail.com wrote:
>
>> [snip]
>>
>> It is not "critical self-evaluation" to repeat "Python 3 is not ready" as
>> litany in #Python and your supporting website.  I use the word "litany"
>> here
>> because #Python refers users to what appears to be a religious website
>> http://python-commandments.org/python3.html
>>
>
> It's not my website.  I don't own the domain, I don't control the hosting,
> I didn't generate the content, I have no access to change anything on it.
>  I've barely even frequented #python in the last three years.
>
> Perhaps you were directing those comments at Stephen Thorne though
> (although I don't know if he's any more involved in it than I am so don't
> take this as anything but idle speculation).
>
>  I have further witnessed (and even been the other party to) you and other
>> ops in #Python telling package developers, who have clearly said that they
>> are working to port their legacy package to Py3, that "Python 3 is not
>> ready".
>>
>
> I'm not going to condone or condemn events which I didn't observe.
>
> However you've never witnessed me discouraging developers who were actively
> porting software to Python 3 because I've never done it.  I'm sure this was
> an honest mistake and you simply confused me with someone else.
>
>  Besides rally against it what have you, as a Twisted developer, done
>> regarding the Python 3 migration process?
>>
>
> This, however, I find extremely insulting.  I don't answer to you.  The
> only reason I'm replying at all is to correct the two pieces of
> misinformation in your message.
>
> I don't see how this discussion can go anywhere productive, so I'll do my
> best to make this my last post on the subject.  Obviously I made a mistake
> posting to the thread at all.
>
> Jean-Paul
>

From solipsis at pitrou.net  Sat Jun 19 17:43:26 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 19 Jun 2010 17:43:26 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<20100619142800.2412.647073227.divmod.xquotient.206@localhost.localdomain>
	<AANLkTillms0Tv61NVT7tFLsxmuleMTF4s8SF57hqKIL2@mail.gmail.com>
Message-ID: <20100619174326.22cac1a3@pitrou.net>

On Sat, 19 Jun 2010 11:14:51 -0400
Arc Riley <arcriley at gmail.com> wrote:
> python-commandments.org is owned and hosted by the same person (Allen Short
> aka dash aka washort) as pound-python.org which is the "official" website
> for #Python and which links to it.
> 
> #Python is co-managed by Stephen Thorne (aka Jerub) and Allen Short (aka
> dash aka washort).  According to Freenode services, the channel operators
> include more than half the active Twisted Matrix developers, including
> yourself.  Each of you has had the ability to change the topic at any time.

I don't think it's constructive to treat the Twisted developers as a
uniform society.

I would expect #python (which I don't think I have ever participated
in) to function like any community, where you don't make unilateral
changes if others disagree with you. Jean-Paul said "I've barely even
frequented #python in the last three years". Knowing this, I don't know
how he could impose a topic change on his own.

> I'm sorry if we're in uncomfortable finger-pointing mode, but in the spirit
> of critical self-evaluation I think it's time we take a long look at who is
> actually representing the Python community in operating our primary
> community help channel and whether that situation should continue.

Well, perhaps, but whether Python 3 is misrepresented shouldn't be the
only metric, then.

Regards

Antoine.


From pje at telecommunity.com  Sat Jun 19 18:07:43 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sat, 19 Jun 2010 12:07:43 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100619160755.4E50C3A4060@sparrow.telecommunity.com>

At 10:55 PM 6/19/2010 +0900, Stephen J. Turnbull wrote:
>They really don't care that writing Python 3 was fun, and that 
>programming in Python 3 is more fun than ever.  That doesn't 
>compensate for even one lingering str/bytes bogosity to most of 
>them, and since they don't get paid for fixing Python library bugs, 
>they don't, and they're in no mood to *forgive* any, either.

This is pretty much where I'm at, except that the only potential fun 
increases Py3 appears to offer me are argument annotations and 
keyword-only args -- but these are partly balanced by the loss of 
argument tuple unpacking.  The metaclass keyword argument is nice, 
but the loss of dynamically-settable __metaclass__ is just plain annoying.
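
For readers who have not met the features named above, here is a minimal
Python 3 sketch of each one.  The names (scale, norm, Registry, Plugin,
Legacy) are made up for illustration and do not come from this thread.

```python
# Keyword-only argument (everything after the bare "*") plus annotations.
def scale(value: float, *, factor: float = 2.0) -> float:
    return value * factor


# Python 2 allowed "def norm((x, y)): ...".  Python 3 dropped tuple
# parameters, so the argument has to be unpacked inside the body.
def norm(point):
    x, y = point
    return (x * x + y * y) ** 0.5


class Registry(type):
    """Hypothetical metaclass that records the name of each class it builds."""

    classes = []

    def __init__(cls, name, bases, namespace):
        super().__init__(name, bases, namespace)
        Registry.classes.append(name)


# Python 3 spelling: the metaclass is passed as a keyword in the class header.
class Plugin(metaclass=Registry):
    pass


# Python 2 spelling: in Python 3 this is only an ordinary class attribute,
# so Legacy is built by the default metaclass, type.
class Legacy:
    __metaclass__ = Registry
```

Calling scale(3.0, 4.0) raises TypeError because factor can only be passed
by keyword, and type(Legacy) is type rather than Registry.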

Really, just about everything that Py3 offers in the way of added 
fun, seems offset by a matching loss somewhere else.  So it's hard to 
get excited about it - it seems like, "ho hum, a new language that's 
kind of like Python, but just different enough to be annoying."

OTOH, I don't know what to do about that, besides adding some sort of 
"killer app" feature that makes Python 3 the One Obvious Way to do 
some specific application domain.


From breamoreboy at yahoo.co.uk  Sat Jun 19 18:28:17 2010
From: breamoreboy at yahoo.co.uk (Mark Lawrence)
Date: Sat, 19 Jun 2010 17:28:17 +0100
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <hvihli$4r7$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvihli$4r7$1@dough.gmane.org>
Message-ID: <hvir5q$1eh$1@dough.gmane.org>

On 19/06/2010 14:43, Georg Brandl wrote:
> On 19.06.2010 15:09, Arc Riley wrote:
>> Just because legacy Python needs to be kept around for a bit longer for
>> a few uses does not mean that "Python 3 is not ready yet".  Any decent
>> package system can have two or more versions of Python installed at the
>> same time.
>>
>> It is not "critical self-evaluation" to repeat "Python 3 is not ready"
>> as litany in #Python and your supporting website.  I use the word
>> "litany" here because #Python refers users to what appears to be a
>> religious website http://python-commandments.org/python3.html
>>
>> I have further witnessed (and even been the other party to) you and
>> other ops in #Python telling package developers, who have clearly said
>> that they are working to port their legacy package to Py3, that "Python
>> 3 is not ready".  One of our Summer of Code students this year actually
>> included in his application that he was told (strongly) in #Python that
>> he shouldn't be working with Py3 - even after he expressed his intent to
>> apply under the PSF to help with the Py3 migration effort as his project.
>
> Ouch.  Looks like it's time for the PSU to release the 10-ton wei
>

Please raise a new issue, the weight should be 16 ton to conform to 
Python standards.

Cheers.

Mark Lawrence.


From debatem1 at gmail.com  Sat Jun 19 21:02:53 2010
From: debatem1 at gmail.com (geremy condra)
Date: Sat, 19 Jun 2010 12:02:53 -0700
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTillms0Tv61NVT7tFLsxmuleMTF4s8SF57hqKIL2@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<20100619142800.2412.647073227.divmod.xquotient.206@localhost.localdomain>
	<AANLkTillms0Tv61NVT7tFLsxmuleMTF4s8SF57hqKIL2@mail.gmail.com>
Message-ID: <AANLkTinqHVpcWHKO4sy8QlGSMCpFiTNioClv5PPc61AL@mail.gmail.com>

On Sat, Jun 19, 2010 at 8:14 AM, Arc Riley <arcriley at gmail.com> wrote:
> python-commandments.org is owned and hosted by the same person (Allen Short
> aka dash aka washort) as pound-python.org which is the "official" website
> for #Python and which links to it.
>
> #Python is co-managed by Stephen Thorne (aka Jerub) and Allen Short (aka
> dash aka washort).  According to Freenode services, the channel operators
> include more than half the active Twisted Matrix developers, including
> yourself.  Each of you has had the ability to change the topic at any time.
>
> I may have cast an overly broad net in including you, I don't have IRC logs
> to review.  I do remember that you have contributed a great deal of time to
> helping people in #Python and that you were fairly active as a channel
> operator in #Python when the anti-Py3 rhetoric got started.  Perhaps you can
> shine some light on who is actually responsible for promoting this?
>
> I'm sorry if we're in uncomfortable finger-pointing mode, but in the spirit
> of critical self-evaluation I think it's time we take a long look at who is
> actually representing the Python community in operating our primary
> community help channel and whether that situation should continue.

Amen. I've heard about people being told not to use python3 on the
irc *way* too many times for it to be all make believe.

Geremy Condra

From simon at ikanobori.jp  Sat Jun 19 21:55:34 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Sat, 19 Jun 2010 21:55:34 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
Message-ID: <1A151F99-EAA2-4897-88B5-B3E765AF66BD@ikanobori.jp>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Dear all,
Sorry for the maybe somewhat late response but I am not a subscriber  
on the python-dev mailing list. Someone else pointed me towards this  
thread and I want to shortly clarify a few things regarding the  
following two statements:
> It is not "critical self-evaluation" to repeat "Python 3 is not ready" as
> litany in #Python and your supporting website.  I use the word "litany" here
> because #Python refers users to what appears to be a religious website
> http://python-commandments.org/python3.html

> python-commandments.org is owned and hosted by the same person (Allen Short
> aka dash aka washort) as pound-python.org which is the "official" website
> for #Python and which links to it.

Both python-commandments.org and pound-python.org are my websites. I  
own both the domains and I do all administrative tasks regarding these  
domains.
pound-python.org is the official #python website and as such is
maintained on Launchpad by a team of volunteers, see:
https://launchpad.net/~pound-python which is indeed owned by Allen Short.
However, Allen Short has nothing to do with the Python Commandments  
page. That is an endeavor for which I am the sole responsible person.  
I have asked some people to contribute texts but that doesn't change  
that I should be spoken to regarding the content on that website.
If there are any issues with the content on either website please do  
not hesitate to contact me at this email address or on IRC where I go  
by the nickname of ikanobori.
As for the potentially harmful text on Python 3 which is included on  
the python-commandments website I do get the hint that it might not be  
clear enough that the text does not apply to people who are porting  
libraries. This is a complaint I have heard before and to which I will  
take affirmative action by explicitly adding text to clarify that.
Hope all is well,
Regards,
Simon de Vlieger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQIcBAEBAgAGBQJMHSC2AAoJEBBSHP7i+JXf5pMQANPBCUzDwx2xjTP8shA1E4mx
7/OQk27nxt+wOZNT0Ybe/iNXLetF6qa8At7kTau/yU3l/xJWVODjfJUICkDv/0ad
ebMKiFeKO8jqdvEe+RL3ck7jTXEM73C2PLNtge9FLTY6HhYrXnOJakNbpWPJR/PG
TQQ+mY/8ZvSP+n98RrY9kcVaVJMSmXUJWHvWVh+LkcIDwF/h30EH/e5PUGzylINI
NiV5955pNRXTnwdgjsouljUI/rrod3zphnUEyL22QvSUx0b7YXMfC24eRGTpwrLg
9cyQAMjjbuVqkhSJhYFnm+DKwsZEAHxxOvu50Xwuy3i1C7c8L6/QDT1txoSTVuaP
4xw8GSFEblbHviz7hY7KCe5nMpBNHNfcGFHFSWd+WYogRXjpDitlMDNW8HT56pRW
lwzs1WENnoOSCAn4Xds+xPJj9JyAGnS8rWz70RVMyrkHDFaJhDlIDNpEFdlAlywT
R0uCQrlxs/uWzAXK2IA0wXPtm/m8fYLR3q8mD4++QotZKQcT4ciN7Xv913/ZT2b2
NtR1WEoTZAV+gWrFyFsgmMFAmZhvUdI8Ludxs3l2smHHaCFUkj2Ur9BrkMiEv5Z8
wLN+/LRaHgGnmVT2SF0LOCeOLz97dP728OKBO0DwxqT89Cla8445z7ktdHnJ3amA
gjbsfG7W+yx9L2v0IDFC
=YDiR
-----END PGP SIGNATURE-----

From turnbull at sk.tsukuba.ac.jp  Sat Jun 19 22:23:09 2010
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Sun, 20 Jun 2010 05:23:09 +0900
Subject: [Python-Dev] Python Library Support in 3.x
In-Reply-To: <1A151F99-EAA2-4897-88B5-B3E765AF66BD@ikanobori.jp>
References: <1A151F99-EAA2-4897-88B5-B3E765AF66BD@ikanobori.jp>
Message-ID: <87hbkydb6a.fsf@uwakimon.sk.tsukuba.ac.jp>

Simon de Vlieger writes:

 > As for the potentially harmful text on Python 3 which is included on  
 > the python-commandments website I do get the hint that it might not be  
 > clear enough that the text does not apply to people who are porting  
 > libraries. 

It also doesn't apply to people who don't need unported libraries, eg,
where the task is plain old text filtering or command line scripting.
Don't ask me for the list of "unported libraries", I know of none from
personal experience.<wink>

You might also want to withdraw the claim that Python 2.x is actively
developed.  With the release of 2.7, that's not true any more, not in
the sense that most people think of "actively developed."


From alexander.belopolsky at gmail.com  Sat Jun 19 22:43:18 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 19 Jun 2010 16:43:18 -0400
Subject: [Python-Dev] Year 0 and year 10,000 in timetuple
Message-ID: <AANLkTin7CycVLkMOLPfjZOkRpkXJy7IS-6qEfudQrkC-@mail.gmail.com>

While the datetime range is limited to years from 1 through 9999, it is
possible to produce a time tuple with year 0 or year 10,000:

>>> t1 = datetime.min.replace(tzinfo=timezone.max)
>>> t2 = datetime.max.replace(tzinfo=timezone.min)
>>> t1.utctimetuple().tm_year
0
>>> t2.utctimetuple().tm_year
10000

Most if not all functions consuming timetuples are not designed to
handle years beyond 9999 and such timetuples cannot be converted back
to datetime.

I would like to make the utctimetuple() method raise OverflowError on
values like t1 or t2 above.  These values are most certainly a mistake
in the application, and it is better to detect them early, before they
make their way into system functions that cannot handle them.

See issues 9005 and 6608 on the tracker.

http://bugs.python.org/issue9005
http://bugs.python.org/issue6608
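
A minimal sketch of the proposed behaviour, written against the public
datetime API.  The helper names checked_utctimetuple and
utc_year_overflows are hypothetical and not part of the module; the
sketch relies only on the fact that naive datetime arithmetic raises
OverflowError when the result leaves the supported range:

```python
from datetime import datetime, timedelta, timezone


def checked_utctimetuple(dt):
    """Return dt's UTC time tuple (hypothetical helper).

    Raises OverflowError when normalizing to UTC would move the
    year outside the range supported by datetime (1 through 9999).
    """
    offset = dt.utcoffset()
    if offset is None:
        # Naive datetime: nothing to normalize.
        return dt.utctimetuple()
    # Naive arithmetic raises OverflowError if the result is out of range.
    utc_naive = dt.replace(tzinfo=None) - offset
    return utc_naive.utctimetuple()


def utc_year_overflows(dt):
    """Report whether checked_utctimetuple(dt) would raise (hypothetical)."""
    try:
        checked_utctimetuple(dt)
    except OverflowError:
        return True
    return False


t1 = datetime.min.replace(tzinfo=timezone.max)
t2 = datetime.max.replace(tzinfo=timezone.min)
ordinary = datetime(2010, 6, 19, 16, 43, tzinfo=timezone(timedelta(hours=-4)))
```

With this helper, t1 and t2 are rejected at the point where the time
tuple is produced, while an ordinary aware datetime is converted as
usual.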

From guido at python.org  Sun Jun 20 00:12:29 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 19 Jun 2010 15:12:29 -0700
Subject: [Python-Dev] Year 0 and year 10,000 in timetuple
In-Reply-To: <AANLkTin7CycVLkMOLPfjZOkRpkXJy7IS-6qEfudQrkC-@mail.gmail.com>
References: <AANLkTin7CycVLkMOLPfjZOkRpkXJy7IS-6qEfudQrkC-@mail.gmail.com>
Message-ID: <AANLkTims3IY7LoLFq8cfsjvuqBqzlOB6DNu0WcitqM4i@mail.gmail.com>

But what if they are used intentionally as "impossible" or sentinel values?

--Guido (on Android)

On Jun 19, 2010 2:37 PM, "Alexander Belopolsky" <
alexander.belopolsky at gmail.com> wrote:
> While the datetime range is limited to years from 1 through 9999, it is
> possible to produce a time tuple with year 0 or year 10,000:
>
>>>> t1 = datetime.min.replace(tzinfo=timezone.max)
>>>> t2 = datetime.max.replace(tzinfo=timezone.min)
>>>> t1.utctimetuple().tm_year
> 0
>>>> t2.utctimetuple().tm_year
> 10000
>
> Most if not all functions consuming timetuples are not designed to
> handle years beyond 9999 and such timetuples cannot be converted back
> to datetime.
>
> I would like to make the utctimetuple() method raise OverflowError on
> values like t1 or t2 above. These values are most certainly a mistake
> in the application, and it is better to detect them early, before they
> make their way into system functions that cannot handle them.
>
> See issues 9005 and 6608 on the tracker.
>
> http://bugs.python.org/issue9005
> http://bugs.python.org/issue6608

From raymond.hettinger at gmail.com  Sun Jun 20 00:27:11 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 19 Jun 2010 15:27:11 -0700
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvhal8$390$2@dough.gmane.org>
References: <medczp3dq3tj4doi18062010025250@SMTP>	<4C1BEE60.4040508@voidspace.org.uk>
	<606BEE82-71C7-4033-A273-F9E945ACDF90@gmail.com>
	<hvhal8$390$2@dough.gmane.org>
Message-ID: <BAE8BE95-A93C-4AD2-B83F-38B9734A0F39@gmail.com>


On Jun 18, 2010, at 7:39 PM, Terry Reedy wrote:

> On 6/18/2010 6:51 PM, Raymond Hettinger wrote:
>> There has been a disappointing
>> lack of bug reports across the board for 3.x.
> 
> Here is one from this week involving the interaction of array and bytearray. It needs a comment from someone who can understand the C-API based patch, which is beyond me.
> http://bugs.python.org/issue8990

I'll take a look at this one.


Raymond


P.S.  For those who are interested, here is the story on BeautifulSoup:
http://www.crummy.com/software/BeautifulSoup/3.1-problems.html

From alexander.belopolsky at gmail.com  Sun Jun 20 00:31:52 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sat, 19 Jun 2010 18:31:52 -0400
Subject: [Python-Dev] Year 0 and year 10,000 in timetuple
In-Reply-To: <AANLkTims3IY7LoLFq8cfsjvuqBqzlOB6DNu0WcitqM4i@mail.gmail.com>
References: <AANLkTin7CycVLkMOLPfjZOkRpkXJy7IS-6qEfudQrkC-@mail.gmail.com>
	<AANLkTims3IY7LoLFq8cfsjvuqBqzlOB6DNu0WcitqM4i@mail.gmail.com>
Message-ID: <A94EB78F-73B0-4A50-AEA1-F96615ADC0F5@gmail.com>


On Jun 19, 2010, at 6:12 PM, Guido van Rossum <guido at python.org> wrote:

> But what if they are used intentionally as "impossible" or sentinel  
> values?
>

That would be another reason not to produce them accidentally.  Note
that I am proposing disallowing production of out of range years from
valid datetime objects, not consumption of them if that is allowed
anywhere.
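
The distinction between producing and consuming can be sketched as
follows.  This is an illustration only: the names SENTINEL,
is_sentinel and convertible are hypothetical.  A caller may still
build an out-of-range time tuple by hand and use it as a sentinel;
what such a tuple cannot do is round-trip back into a datetime:

```python
import time
from datetime import datetime, MINYEAR, MAXYEAR

# A hand-built time tuple with year 10000, used as a deliberate sentinel.
# Fields: year, month, day, hour, minute, second, weekday, yearday, isdst.
SENTINEL = time.struct_time((10000, 1, 1, 0, 0, 0, 0, 1, 0))


def is_sentinel(tt):
    """True if the tuple's year lies outside the datetime range."""
    return not (MINYEAR <= tt.tm_year <= MAXYEAR)


def convertible(tt):
    """True if the first six fields can be turned back into a datetime."""
    try:
        datetime(*tt[:6])
    except ValueError:
        return False
    return True


in_range = time.struct_time((2010, 6, 19, 18, 31, 52, 5, 170, 0))
```

Consumers of hand-built sentinels are unaffected by the proposal; only
valid datetime objects would stop producing such tuples on their own.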

From tjreedy at udel.edu  Sun Jun 20 02:02:03 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 19 Jun 2010 20:02:03 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
Message-ID: <hvjlpt$8pe$1@dough.gmane.org>

After reading the discussion in the previous thread, I signed in to 
#python and verified that the intro message starts with a lie about 
python3. I also verified that the official #python site links to "Python 
Commandment Don't use Python 3? yet". The excuse that the negative 
commandment site is not part of the official site does not wash. The 
#python site maintainer chose that as the authoritative word on the 
topic "On using Python 2.x or Python 3.x".

Since a fair, half-intelligent person would know that the usability of 
Python3 depends on the user, this all strikes me as conscious sabotage.

To me, this, along with other reports, is really ugly. I do not wish to 
fight such people; but I would rather ask python3 questions in a pro- 
rather than anti-python3 atmosphere. #python is certainly not a place 
that I would refer new people to.

Given that the 'owners' of #python have been asked and refuse to remove 
their negative-opinion-stated-as-leading-headline-fact, it seems to me 
that we need a separate #python3 channel. The topic could be "Welcome to 
discussion of Python3, the latest, greatest version of Python." The first 
link might be to the current stable Python3 docs. Hence the '!' in the 
subject line.

However, I have very little experience with IRC and consequently have 
little idea what getting a permanent, owned, channel like #python 
entails. Hence the '?' that follows.

What do others think?


From glyph at twistedmatrix.com  Sun Jun 20 02:24:07 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Sat, 19 Jun 2010 17:24:07 -0700
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvjlpt$8pe$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
Message-ID: <4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>

On Jun 19, 2010, at 5:02 PM, Terry Reedy wrote:

> However, I have very little experience with IRC and consequently have little idea what getting a permanent, owned, channel like #python entails. Hence the '?' that follows.
> 
> What do others think?

Sure, this is a good idea.

Technically speaking, this is extremely easy.  Somebody needs to "/msg chanserv register #python3" and that's about it.  (In this case, that "someone" may need to be Brett Cannon, since he is the official group contact for Freenode regarding Python-related channels.)

Practically speaking, you will need a group of at least a dozen contributors, each in a different timezone, who sit there all day answering questions :).  Otherwise the ownership of the channel is just a signpost pointing at an empty room.


From debatem1 at gmail.com  Sun Jun 20 02:39:44 2010
From: debatem1 at gmail.com (geremy condra)
Date: Sat, 19 Jun 2010 17:39:44 -0700
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvjlpt$8pe$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
Message-ID: <AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>

On Sat, Jun 19, 2010 at 5:02 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> After reading the discussion in the previous thread, I signed in to #python
> and verified that the intro message starts with a lie about python3. I also
> verified that the official #python site links to "Python Commandment Don't
> use Python 3? yet". The excuse that the negative commandment site is not
> part of the official site does not wash. The #python site maintainer
> chose that as the authoritative word on the topic "On using Python 2.x or
> Python 3.x".
>
> Since a fair, half-intelligent person would know that the usability of
> Python3 depends on the user, this all strikes me as conscious sabotage.
>
> To me, this, along with other reports, is really ugly. I do not wish to
> fight such people; but I would rather ask python3 questions in a pro- rather
> than anti-python3 atmosphere. #python is certainly not a place that I would
> refer new people to.
>
> Given that the 'owners' of #python have been asked and refuse to remove
> their negative-opinion-stated-as-leading-headline-fact, it seems to me that
> we need a separate #python3 channel. The topic could be "Welcome to
> discussion of Python3, the latest, greatest version of Python." The first
> link might be to the current stable Python3 docs. Hence the '!' in the
> subject line.
>
> However, I have very little experience with IRC and consequently have little
> idea what getting a permanent, owned, channel like #python entails. Hence
> the '?' that follows.
>
> What do others think?

Seems like it turns a disagreement into a power struggle that python-dev
is unlikely to win. If people here were interested in the irc, the irc culture
would never have become as disconnected from the core group as it has,
and even the most impassioned call isn't going to build an active
community overnight. Furthermore, if #python has 200 people in it and
#python3 is a ghost town, they can just tell anybody asking a python3
question to go to #python3 and snicker, reinforcing the widely held belief
that python3 itself is a failure. It also runs the risk of hardening their
existing position, and in any event begins the process of fracturing the
community at a point where 3.x is probably not going to come out on top.

Bottom line, what I'd really like to do is kick them all off of #python, but
practically I see very little that can be done to rectify the situation at this
point.

Geremy Condra

From glyph at twistedmatrix.com  Sun Jun 20 02:56:38 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Sat, 19 Jun 2010 17:56:38 -0700
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
Message-ID: <A4A6C438-7CD9-44B8-942A-4A00A9E69A0D@twistedmatrix.com>

On Jun 19, 2010, at 5:39 PM, geremy condra wrote:

> Bottom line, what I'd really like to do is kick them all off of #python, but
> practically I see very little that can be done to rectify the situation at this
> point.

Here's something you can do: port libraries to python 3 and make the ecosystem viable.

It's as simple as that.  Nobody on #python has an ideological axe to grind, they just want to tell users to use tools which actually solve their problems.  (Well, unless you think that "helping users" is ideological axe-grinding, in which case I think you may want to re-examine your own premises.)

If Python 3 had all the same features and libraries as Python 2, and ran in all the same places (for example, as Stephen Thorne reminded me when I asked him about this, the oldest supported version of Red Hat Enterprise Linux...) then it would be an equally viable answer on IRC.  It's going to take a lot of work to get it to that point.

Even if you write code, of course, it's too much work for one person to fill the whole gap.  Have some patience.  The PSF is funding these efforts, and more library authors are porting all the time.  Eventually, resistance in forums like Freenode's #python will disappear.  But you can't make it go away by wishing it away, you have to get rid of the cause.


From raymond.hettinger at gmail.com  Sun Jun 20 03:12:35 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Sat, 19 Jun 2010 18:12:35 -0700
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
Message-ID: <7F841A1E-213A-40A0-8BAC-3CDEF9838E94@gmail.com>


On Jun 19, 2010, at 5:39 PM, geremy condra wrote:
> Bottom line, what I'd really like to do is kick them all off of #python,

This is so profoundly wrong on so many levels it is hard to know how to respond.


Raymond

From jacob at jacobian.org  Sun Jun 20 03:19:28 2010
From: jacob at jacobian.org (Jacob Kaplan-Moss)
Date: Sat, 19 Jun 2010 18:19:28 -0700
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <7F841A1E-213A-40A0-8BAC-3CDEF9838E94@gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com> 
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com> 
	<7F841A1E-213A-40A0-8BAC-3CDEF9838E94@gmail.com>
Message-ID: <AANLkTilXsEmnLxOabTajSrThDIjF9fWThJhALwPXReoO@mail.gmail.com>

On Sat, Jun 19, 2010 at 6:12 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> This is so profoundly wrong on so many levels it is hard to know how to respond.

C'mon, Raymond, that's not any more helpful.

Geremy wasn't trying to argue for that course of action; he was
expressing his frustration with the culture that's developed in
#python. There's nothing wrong with frustration, and there's nothing
wrong with expressing those -- or any -- feelings. Indeed, I'm happy
that folks are blowing off a bit of steam here instead of doing
something silly in public.

Let's all try to simmer down here a little bit and cut each other some
slack: this is a frustrating situation, and we're not going to help it
by heaping more fuel on the fire.

Jacob

From steve at pearwood.info  Sun Jun 20 04:04:30 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 20 Jun 2010 12:04:30 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <201006201204.30795.steve@pearwood.info>

On Sat, 19 Jun 2010 11:55:29 pm Stephen J. Turnbull wrote:

> If you want to do Python 3 a favor,
> make sure that they understand that Python 3 is *not* an "upgrade" of
> Python 2.
[...]
> Python 3 is a Python-2-like language, but even though it's built on
> the same design principles, and uses nearly identical syntax, there
> are fundamental differences.  And it is *very* young.  So it's a new
> language and should be approached in the same way as any new
> language.

I haven't written any large projects in Python3, so take this with a 
grain of salt, but I just don't see that Python3 is a "new language" as 
most people understand the term. It might be splitting hairs, but I see 
it as a new dialect *at worst*, and probably not even that, in the 
sense that any half decent human coder who can read Python 2.x code 
should be able to make sense of Python 3.x code, and vice versa.

As I see it, the changes to the language and syntax between 2.x and 3.x 
are much smaller than those between 1.x and 2.x:

Python 2.x introduced a brand new object model (new style classes). 
Python 3.x does not.

Python 2.x introduced radically new syntax, namely list comprehensions, 
while 3.x merely extends the same idea to set and dict comprehensions.

Python 2.x introduced lexical scoping AND closures. Python 3.x does 
nothing as radical.

Python 2.x introduced a new (to Python) programming model, namely 
iterators, complete with TWO extensions to syntax (generator functions 
including yield, generator expressions), *and* then went and made yield 
an expression so as to introduce coroutines as well. Python 3.x merely 
uses iterators in more places.

Python 2.x introduced Unicode strings. Python 3.x merely makes them the 
default.

The only major difference is that Python 3 takes away as well as adding, 
but even there, Python 2 did the same, e.g. there is no provision to 
get the old scoping behaviour except to go back and use 2.1 or older.
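
To make the comparison concrete, here is a small sketch (the variable
names are illustrative only).  The list comprehension and generator
expression are the 2.x-era forms; the set and dict comprehensions are
the 3.x extensions of the same idea, and the plain string literal is a
Unicode string by default:

```python
words = ["spam", "egg", "spam"]

# Introduced in the 2.x series:
lengths = [len(w) for w in words]           # list comprehension
total = sum(len(w) for w in words)          # generator expression

# The same idea extended in 3.x:
unique = {w for w in words}                 # set comprehension
length_by_word = {w: len(w) for w in words} # dict comprehension

# In 3.x an unprefixed literal is a Unicode string.
name = "L\u00f6wis"
encoded = name.encode("utf-8")              # explicit step to get bytes
```

Someone who can read the first two lines should have little trouble
reading the next two, which is the sense in which the syntax changes
are an extension rather than a new language.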

Frankly, I believe that pushing the meme that "Python 3 is different" is 
a strategic mistake. People hate and fear change. I should know this. I 
resisted Python 2.x and stuck with 1.5 until Python 2.3 was released, 
and then was amazed at how *easy* the transition was. Of course, I 
wasn't using third party libraries that hadn't been ported to 2.3; if I 
had, my experience would have been different. It's bad enough to have to 
tell people "Python 3 is currently lacking some critical libraries, 
particularly third-party libraries" without also telling them (wrongly 
IMO) "oh, and it's a new language too".


-- 
Steven D'Aprano

From steve at pearwood.info  Sun Jun 20 04:05:46 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Sun, 20 Jun 2010 12:05:46 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvijae$9tc$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>
Message-ID: <201006201205.46507.steve@pearwood.info>

On Sun, 20 Jun 2010 12:13:34 am Tres Seaver wrote:

> > I guess tutorial welcome, rather than patch welcome then ;)
>
> The only folks who can write the tutorial are the ones who have
> already drunk the koolaid.  Note that I've been making my living with
> Python for about twelve years now, and would *like* to use Python3,
> but can't, yet, and therefore haven't taken the first sip.

You emphatically say you would "like" to use Python3, but describe those 
who already have as having drunk the Koolaid. Comparing those who can 
and have successfully moved to Python3 with the Jonestown cult 
mass-suicide doesn't really strike me as a sign that you want to join 
them.


-- 
Steven D'Aprano

From guido at python.org  Sun Jun 20 04:21:35 2010
From: guido at python.org (Guido van Rossum)
Date: Sat, 19 Jun 2010 19:21:35 -0700
Subject: [Python-Dev] Year 0 and year 10,000 in timetuple
In-Reply-To: <A94EB78F-73B0-4A50-AEA1-F96615ADC0F5@gmail.com>
References: <AANLkTin7CycVLkMOLPfjZOkRpkXJy7IS-6qEfudQrkC-@mail.gmail.com> 
	<AANLkTims3IY7LoLFq8cfsjvuqBqzlOB6DNu0WcitqM4i@mail.gmail.com> 
	<A94EB78F-73B0-4A50-AEA1-F96615ADC0F5@gmail.com>
Message-ID: <AANLkTim-auvUEihu3iVGjn3SXdVzJhAheR2CHCuNDIGV@mail.gmail.com>

On Sat, Jun 19, 2010 at 3:31 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Jun 19, 2010, at 6:12 PM, Guido van Rossum <guido at python.org> wrote:
>> But what if they are used intentionally as "impossible" or sentinel
>> values?

> That would be another reason not to produce them accidentally.  Note that I am
> proposing disallowing production of out of range years from valid datetime
> objects, not consumption of them if that is allowed anywhere.

OK.

-- 
--Guido van Rossum (python.org/~guido)

From tjreedy at udel.edu  Sun Jun 20 04:44:49 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 19 Jun 2010 22:44:49 -0400
Subject: [Python-Dev] #Python3 ! ?
In-Reply-To: <A4A6C438-7CD9-44B8-942A-4A00A9E69A0D@twistedmatrix.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
	<A4A6C438-7CD9-44B8-942A-4A00A9E69A0D@twistedmatrix.com>
Message-ID: <hvjvb2$r3u$1@dough.gmane.org>

On 6/19/2010 8:56 PM, Glyph Lefkowitz wrote:
 > On Jun 19, 2010, at 5:39 PM, geremy condra wrote:
 >
 >> Bottom line, what I'd really like to do is kick them all off of
 >> #python, but
 >> practically I see very little that can be done to rectify the
 >> situation at this
 >> point.

Given the experiences you reported, I can understand that sentiment, but 
I explicitly disclaimed any intent to fight or power struggle.

 > Here's something you can do: port libraries to python 3 and make the
 > ecosystem viable.
 >
 > It's as simple as that. Nobody on #python has an ideological axe to
 > grind,

Then why are they grinding an anti-Python3 axe?

As I explained in my original post, I did not take anyone's word for it; 
I verified for myself that they are indeed doing so, and said why I think so.

There are people who are opposed to Python3 and have the fantasy that if 
it fails, the devs would continue to pile new features, sometimes 
duplicative ones, into 2.x and never remove anything. They do not 
care that this would make the language harder and harder for new learners.

However, I will consider taking your claim at face value and, ignoring 
the insulting login message and site, try a Python3 question and see 
what response I get.

 > they just want to tell users to use tools which actually solve
 > their problems.

But that is not what they are doing. Python3 solved many of *my* 
problems with Python2, and there they are, commanding me and potential 
readers of my book-in-progress not to use it. If they wanted to help 
people make an intelligent choice between Python2 and Python3, they 
would point people to a discussion of the pros and cons of each. There 
have been several posted on python-list. Anyone who posted either "Do 
not use Python3" or "Do not use Python2" as a sweeping answer to a 
generic enquiry about 2 versus 3 might rightfully be blasted as a troll.

 > If Python 3 had all the same features and libraries as Python 2,

Python3 has several features that Python2 does not. To me, nearly all 
the deletions and changes make the language better, much better, for 
*my* purposes. However, I am glad that the PSF exists to make all 
versions of Python available indefinitely for anyone who has need of 
them. I would not dream of saying "Python2: do not use it" to anyone 
except in response to a question about a specific problem solved in 
Python3 and not in Python2.

Terry Jan Reedy


From ncoghlan at gmail.com  Sun Jun 20 05:27:52 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Jun 2010 13:27:52 +1000
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <1A151F99-EAA2-4897-88B5-B3E765AF66BD@ikanobori.jp>
References: <1A151F99-EAA2-4897-88B5-B3E765AF66BD@ikanobori.jp>
Message-ID: <AANLkTinGPXpQdSEcg8Vfr8CArOJfkEE33taUofh1MkL5@mail.gmail.com>

On Sun, Jun 20, 2010 at 5:55 AM, Simon de Vlieger <simon at ikanobori.jp> wrote:
> As for the potentially harmful text on Python 3 which is included on the
> python-commandments website I do get the hint that it might not be clear
> enough that the text does not apply to people who are porting libraries.
> This is a complaint I have heard before and to which I will take affirmative
> action by explicitly adding text to clarify that.

I just read that page, and I believe it could do with a little
refinement even from an application developer point of view.

Specifically, rather than "Why shouldn't I use it, yet?", a more
positive phrasing would be "Should I use it, yet?" or "Is Python 3
ready for me, yet?". And then suggest to app developers that they
check the status of Py3k support for libraries they need or think they
will need, as these days many of them will provide a 3.x compatible
version. Staying on 2.x for now is certainly a viable choice - there's
a reason that backports to 2.7 have been a prominent python-dev
activity for the last year or two. With that nearly out the door, the
focus will switch more to Py3k.

Cheers,
Nick.

P.S. wind the clock back 12 months or so, and I think the page as it
currently stands would have been perfectly good advice to app
developers.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Jun 20 06:05:07 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Jun 2010 14:05:07 +1000
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilXsEmnLxOabTajSrThDIjF9fWThJhALwPXReoO@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
	<7F841A1E-213A-40A0-8BAC-3CDEF9838E94@gmail.com>
	<AANLkTilXsEmnLxOabTajSrThDIjF9fWThJhALwPXReoO@mail.gmail.com>
Message-ID: <AANLkTilCIwLwocH0PD3ectAq7wLsbOG-rRc2gDRMqUGX@mail.gmail.com>

On Sun, Jun 20, 2010 at 11:19 AM, Jacob Kaplan-Moss <jacob at jacobian.org> wrote:
> Let's all try to simmer down here a little bit and cut each other some
> slack: this is a frustrating situation, and we're not going to help it
> by heaping more fuel on the fire.

The other thing to keep in mind is that there was a time when what the
#python folks are still saying *wasn't wrong*. Yes, their advice is
too negative for the situation as it stands now. But go back 12 or 18
months and their description would have been far more apt.

It sounds like they're happy to update the relevant pages to provide a
more balanced perspective now, and that's the important point.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From debatem1 at gmail.com  Sun Jun 20 07:39:35 2010
From: debatem1 at gmail.com (geremy condra)
Date: Sun, 20 Jun 2010 01:39:35 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <7F841A1E-213A-40A0-8BAC-3CDEF9838E94@gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTinpMXD9Ff-q2Oj6pgPxMk5bUzlKa0IX80YcgZLC@mail.gmail.com>
	<7F841A1E-213A-40A0-8BAC-3CDEF9838E94@gmail.com>
Message-ID: <AANLkTinThBkaQQo-FMwR_gu4zC-jM7CfAsmRvkuH9XfC@mail.gmail.com>

On Sat, Jun 19, 2010 at 9:12 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
> On Jun 19, 2010, at 5:39 PM, geremy condra wrote:
>> Bottom line, what I'd really like to do is kick them all off of #python,
>
> This is so profoundly wrong on so many levels it is hard to know how to respond.

Alright, so, yeah- I said it in the heat of the moment and shouldn't
have. I apologize.
I just hate having to explain to folks that don't know any better that
#python doesn't represent the opinions of the people who actually
develop python, and I'm going to STFU before I get sucked into this
again.

Geremy Condra

From stephen at xemacs.org  Sun Jun 20 11:14:02 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 20 Jun 2010 18:14:02 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <201006201204.30795.steve@pearwood.info>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
Message-ID: <87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>

Steven D'Aprano writes:

 > Frankly, I believe that pushing the meme that "Python 3 is different" is 
 > a strategic mistake.

I agree that it's strategically undesirable.  Unfortunately, the
genuine backward incompatibility, as well as the huge mind-share
already garnered by what I consider wrong-headed advice from certain
quarters, have made pushing the meme that "Python 3 is very nearly the
same" untenable.  It's hard to beat something like "it's not yet time
to use Python 3" with a nuanced explanation.

 > had my experience would have been different. It's bad enough to have to 
 > tell people "Python 3 is currently lacking some critical libraries, 
 > particularly third-party libraries" without also telling them (wrongly 
 > IMO) "oh, and it's a new language too".

That's why I propose the C to C++ analogy.  True, C++ does introduce a
lot of new features, but most programmers migrating from C to C++
don't learn to use them properly for years, if ever, I'm told.

Note also that I don't propose this as PSF advertising.  I proposed it
as a response to Mark's question, "what should I tell my readers?"


From solipsis at pitrou.net  Sun Jun 20 11:29:36 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 11:29:36 +0200
Subject: [Python-Dev] [OT] Re: email package status in 3.X
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>
	<201006201205.46507.steve@pearwood.info>
Message-ID: <20100620112936.7ae73935@pitrou.net>

On Sun, 20 Jun 2010 12:05:46 +1000
Steven D'Aprano <steve at pearwood.info> wrote:
> On Sun, 20 Jun 2010 12:13:34 am Tres Seaver wrote:
> 
> > > I guess tutorial welcome, rather than patch welcome then ;)
> >
> > The only folks who can write the tutorial are the ones who have
> > already drunk the koolaid.  Note that I've been making my living with
> > Python for about twelve years now, and would *like* to use Python3,
> > but can't, yet, and therefore haven't taken the first sip.
> 
> You emphatically say you would "like" to use Python3, but describe those 
> who already have as having drunk the Koolaid. Comparing those who can 
> and have successfully moved to Python3 with the Jonestown cult 
> mass-suicide doesn't really strike me as a sign that you want to join 
> them.

I have read the expression "drinking the Koolaid" more than once but I
didn't know it related to a mass-suicide at all. It changes my
comprehension of it quite a bit...

Regards

Antoine.


From solipsis at pitrou.net  Sun Jun 20 11:32:56 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 11:32:56 +0200
Subject: [Python-Dev] email package status in 3.X
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100620113256.7ba8d86a@pitrou.net>

On Sun, 20 Jun 2010 18:14:02 +0900
"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> 
>  > had my experience would have been different. It's bad enough to have to 
>  > tell people "Python 3 is currently lacking some critical libraries, 
>  > particularly third-party libraries" without also telling them (wrongly 
>  > IMO) "oh, and it's a new language too".
> 
> That's why I propose the C to C++ analogy.

I think it's an unfortunate analogy. C++ needs new libraries (with
brand new APIs) to take advantage of its abstraction capabilities.
Python 3 has almost the same abstraction capabilities as Python 2; you
don't need to write new libraries: just port the existing ones.

> True, C++ does introduce a
> lot of new features, but most programmers migrating from C to C++
> don't learn to use them properly for years, if ever, I'm told.

I don't see how Python 3 has that problem. You can be productive here
and now in Python 3, re-using your knowledge of Python 2 with a bit of
added information.

Regards

Antoine.


From ben+python at benfinney.id.au  Sun Jun 20 12:31:30 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Sun, 20 Jun 2010 20:31:30 +1000
Subject: [Python-Dev] [OT] the Kool-Aid Acid Test (was: email package status
	in 3.X)
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>
	<201006201205.46507.steve@pearwood.info>
Message-ID: <87d3vmov0d.fsf_-_@benfinney.id.au>

Steven D'Aprano <steve at pearwood.info> writes:

> Comparing those who can and have successfully moved to Python3 with
> the Jonestown cult mass-suicide doesn't really strike me as a sign
> that you want to join them.

In my experience, many who refer to "drinking the Kool-Aid" are not
referring to the Jonestown suicide cult, but rather to the earlier
Electric Kool-Aid Acid Test events of the psychedelic era
<URL:http://en.wikipedia.org/wiki/The_Electric_Kool_Aid_Acid_Test>.

-- 
 \       "Whenever you read a good book, it's like the author is right |
  `\   there, in the room talking to you, which is why I don't like to |
_o__)                                  read good books." --Jack Handey |
Ben Finney


From lvh at laurensvh.be  Sun Jun 20 12:35:24 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 12:35:24 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvjlpt$8pe$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
Message-ID: <AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>

Hello,


I'm one of the active people in #python that some people dislike for
behavior with respect to Python 3.

First of all I'd like to defuse the situation, much like Jacob.
Seriously. It's been a bunch of posts so far and most of them have
been pretty angry. Let's take a deep breath and try to fix the
situation that's getting people frustrated like grownups :-) (FWIW: I
find being called worse than half-intelligent pretty offensive. Let's
stop doing that?)

The idea being expressed in the IRC topic is _way_ bigger than the
room an IRC topic gives you. Yes, it's an imperfect medium, yes, it's
probably partially based on the use case: it's just that experience
leads us to believe that the vast majority of use cases ends up being
more in 2.x turf than 3.x turf, at the very least in the past.

I'm sorry if you had the impression people wanted to nail you at the
stake for using Python 3. If that's how you felt, it isn't true. I
basically agree with Glyph. I don't think we've recently (I'm not
omnipresent) told anyone who had good reasons to use Python 3 to stop
using it. If someone's doing work that actually needs Python 3 (most
recent example: a GSOC student porting Sphinx), we try our best to
help, and AFAICT we've mostly been successful. (Please correct me if
you think this is erroneous.) We don't get too many people that
actually want or need that, but I'm guessing that's mostly because
people porting libraries to py3k usually already know what they're
doing so they don't need the first-line-of-defense thing for Python
questions that #python tries to be.

Maybe you disagree on what good reasons are. #python is a bunch of
volunteers giving help, free of charge, which is usually of a pretty
high standard because they're professional Python developers and have
been for a long time. Maybe that biases some of us against Py3k? Fact
remains that there's a bunch of active people on IRC who pour a lot of
time and effort into #python and make a lot of newbies really happy,
and I think the picture you're painting based on a single issue that
clearly not everyone agrees on is a bit disrespectful and somewhat
unfair.

Also, I'm pretty sure nobody has ever said that Python 3.x was a
"failure", or anything like it. #python claims that Python 3.x, as
a platform for building production apps, is a work in progress because
of third party library support, and the language itself is pretty much
done and okay -- a cleaner version of 2.x. People ask why it's too
early to use Py3k, and that's _always_ the answer they get: at least
the first half, and usually the second half too.

In the meantime, we encourage people to write code that will be easy
to port and behave well in 3.x: new-style classes, don't use eager
versions when the Py3k default is lazy and you don't actually need the
eager thing, use as many third party libraries as possible (the idea
being that this would minimize effort needed to make the switch on the
grand scale of things), use absolute imports always (and only explicit
relative, but it's discouraged), always have a full unit test suite.
This is advice that generally makes a lot of sense, and it's the
recommended thing in PEP 3000 for porting to 3.x as well.
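
As a rough illustration of that advice, here is a minimal sketch of 2.x
code written so that it also runs unchanged on 3.x. The class and
function names are hypothetical and are not taken from #python or from
any real library; absolute imports are not shown because the sketch
needs no imports at all.

```python
class Inventory(object):  # inheriting from object gives a new-style class on 2.x
    def __init__(self, counts):
        self._counts = dict(counts)

    def in_stock(self):
        # A generator yields names lazily; no intermediate list is built,
        # so callers that only iterate behave the same on 2.x and 3.x.
        for name in sorted(self._counts):
            if self._counts[name] > 0:
                yield name


def total(counts):
    # sum() accepts any iterable, so it does not matter whether
    # values() returns a list (2.x) or a view (3.x).
    return sum(counts.values())
```

Callers that really need a list can wrap the result explicitly, e.g.
list(inventory.in_stock()), which keeps the eager case visible in the
code.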

We're still telling people to use Python 2.x by default because of a
few major things:

1. going out on a limb here: well over 90% of those people are
completely new to Python and out of those most of them completely new
to programming too,
2. the nicest libraries for doing a lot of stuff aren't ported yet, or
are in the process of being ported but not yet recommended for actual
use by their authors, (this seems to be a point of contention?)
3. we know how to help people better with it

Which are all basically different incarnations of the same issue.
People are working on libraries everywhere and I really don't want to
pretend those people haven't gotten any work done, but AFAICT a lot of
these for existing mature projects that you'd want people to use in
order to be happy productive Python users don't really exist yet or
are at best experimental. At the very least I think most people can
agree that 2.x is still the default release for existing, mature
software projects and most new ones too.

I can only speak for my own area of interest: Python is way too big for
anyone to have used every piece of software for it ever. I,
personally, don't use 3.x because I develop for PyS60 devices,
PythonCE devices (2.5 only), and Twisted servers (2.6), and none of
those work on 3.x yet. The other thing we build is websites, and AFAIK
the web situation, for now, is still "use python2.x", too? (for any
non-trivial website, of course). We use AMQP, and the best thing we've
found for it is 2.x only (maybe Carrot and Pika do 3.x now, but I
can't find any evidence of it). Nobody here (here = place of business)
hates Python 3. We just can't use it.

I'm very sorry if you've been offended. Like Glyph said: we're not
grinding ideological axes. We're just recommending what we honestly
believe is the right tool for the job. We're just humans, we're not
perfect. We make mistakes. If you feel we've made them, please just
tell us and don't start a war. If you tried and failed, please feel
free to tell me how (doesn't have to be in public) and why it failed,
and maybe I (or someone else!) can try to fix it: that's *not* how
stuff is supposed to happen. Maybe someone was being a troll, I
haven't checked but I trust the people I run #python with enough to
say that it probably wasn't a regular. That's IRC for you: the problem
is that if you let everyone speak once in a while trolls open their
mouths. Perhaps something someone said was just taken too seriously. I
don't know the situation you're referring to, I just know #python.

Again, just because someone asked and nobody removed that line ('It's
too early...') doesn't mean we're evil pricks that want people to use
Python2.x because of some hidden agenda. It just means that person
disagrees with the idea that it's a good time to start doing it. IRC
can be a harsh place, not because the people are jerks but because the
medium just lends itself to it. People are generally a lot nicer than
they appear.

Like Nick said: not too long ago this was perfectly sound advice. I'm
convinced it still is; maybe I've (and a lot of people active on
#python) been out of touch with recent evolutions and it's no longer
true. I don't know. I'm just a bit sad that it had to come to angry
ventings (no grudges, I realise most of it is probably just
frustration). I like to think I'm not wrong when I think that if
people just asked "Hey, guys, this Python 3.x rule, don't you think it's
about time we reviewed that? It's been up for a long while." nobody
would get banned or anything. Maybe people disagree and think it
should still be up there: but at least we could have a productive
discussion hopefully resulting in something that makes everyone happy
or at the very least less frustrated. I just asked two regulars and
despite the fact that we're about as widespread as we possibly could
be timezone-wise (SE Asia; Western Europe; WA, USA) nobody remembers
that happening.

Also, on Twisted: yes, #python is very Twisted-minded, we have
a bunch of people that like it, develop it, have built cool software
with it and we think it could help other people too. It's not
ideological axe grinding: a lot of the regulars just genuinely like
Twisted. I'm sorry if you felt that not liking Twisted was going to
get you smacked across the face, but that's not true either: Ronny
Pfannschmidt is a regular, and he really doesn't like Twisted. We just
think that for a lot of questions people come in with, Twisted is a
great solution. That doesn't mean you're not allowed to have contrary
opinions or that all dissent is crushed with an iron fist: it just
means that the people who actually bother to help others day in day
out know Twisted, like Twisted, and think Twisted is a great tool for
a lot of problems. If you don't like Twisted, feel free to use
something else: just don't complain when nobody can help you because
the people offering help are all Twisted users that don't understand
your software and don't have time or incentive to. It's a purely
pragmatic thing. There's no hidden agenda.

I've put bits of this up for review to #python regulars, so when I say
'we' it usually does mean 'we, #python regulars'. Most of it
resonates. Maybe we're just in the distortion field?


thanks for listening,
Laurens

From ncoghlan at gmail.com  Sun Jun 20 12:57:56 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Jun 2010 20:57:56 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100620113256.7ba8d86a@pitrou.net>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100620113256.7ba8d86a@pitrou.net>
Message-ID: <AANLkTilG77MOap_WsPhawv93PEIYLhnlaIEmrM39LE_M@mail.gmail.com>

On Sun, Jun 20, 2010 at 7:32 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
>> True, C++ does introduce a
>> lot of new features, but most programmers migrating from C to C++
>> don't learn to use them properly for years, if ever, I'm told.
>
> I don't see how Python 3 has that problem. You can be productive here
> and now in Python 3, re-using your knowledge of Python 2 with a bit of
> added information.

Yeah, the significant issues with Python 3 over Python 2 *only* apply
to people with legacy Python 2 code to worry about. The one thing that
makes Python 3 potentially less desirable than Python 2 for some new
applications is that the third party library support isn't quite as
good yet. As more of the "big" libraries and frameworks provide Python
3 compatible versions, that factor will go away.

As far as I can tell, with 3 years still to go on my own original
prediction of 5+ years for Python 3 to start to be competitive with
Python 2 for programming mindshare, adoption actually seems to be
progressing fairly well. A lot of key functionality is either already
supported in Python 3 or will be soon, and a lot of the rest is at
least talking about plans for Python 3 compatibility. It's just that 5
years can seem like an eternity in the internet age, so sometimes
people see the relative lack of adoption of Python 3 at this stage and
start to panic about it being a failure.

Now, if we're still having this conversation in 2013, then I'll admit
we have a problem with the Python 3 uptake rate ;)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From fuzzyman at voidspace.org.uk  Sun Jun 20 13:24:59 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 20 Jun 2010 12:24:59 +0100
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
Message-ID: <73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>


On 20 Jun 2010, at 11:35, Laurens Van Houtven <lvh at laurensvh.be> wrote:

> Hello,
>
>
>
> I'm one of the active people in #python that some people dislike for
> behavior with respect to Python 3.
>
> First of all I'd like to defuse the situation, much like Jacob.
> Seriously. It's been a bunch of posts so far and most of them have
> been pretty angry. Let's take a deep breath and try to fix the
> situation that's getting people frustrated like grownups :-) (FWIW: I
> find being called worse than half-intelligent pretty offensive. Let's
> stop doing that?)
>
> The idea being expressed in the IRC topic is _way_ bigger than the
> room an IRC topic gives you.

Hey Laurens - I don't have an issue with anything you've said,
but given the topic is far more nuanced than an IRC topic can express
maybe that just isn't the right place for it.

Michael



From lvh at laurensvh.be  Sun Jun 20 13:33:35 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 13:33:35 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>
Message-ID: <AANLkTikLBcYqyvxkd8DimEIHhv6p2TYjmytSvstwPs69@mail.gmail.com>

Michael,


Fair point! It's mostly put in the topic so people can ask about it
and we can give them more detailed answers, because, as other people
have mentioned, the exact answer depends largely on what *precisely*
someone is doing.

I'm not sure what sort of an effect it would have if we took it out.
Maybe something we could try? I'm not sure it'd have much of a
practical effect since most of the regulars' expertise isn't going to
shift instantly, so getting actual help is probably going to be a bit
rough on 3.x users.

At the very least I'm going to take this suggestion to #python's
regulars and see what they have to say about it :-)

(One of the problems that people I've talked to in private were
"pretty miffed" about is the dissonance between #python and
python-dev, and that there's some problem with people assuming things
said on #python are very authoritative answers (ha ha). I think
this is really bad for Python as a whole and I'd love to hear ideas
on how you guys think it could be fixed.)


thanks
Laurens

From g.rodola at gmail.com  Sun Jun 20 14:26:28 2010
From: g.rodola at gmail.com (=?ISO-8859-1?Q?Giampaolo_Rodol=E0?=)
Date: Sun, 20 Jun 2010 14:26:28 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <201006201204.30795.steve@pearwood.info>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
Message-ID: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>

2010/6/20 Steven D'Aprano <steve at pearwood.info>:
> Python 2.x introduced Unicode strings. Python 3.x merely makes them the
> default.

"Merely"? To me this looks like the main reason why a lot of projects
haven't been ported to Python 3 yet.
I attempted to port pyftpdlib to Python 3 several times, and the
biggest show stopper has always been the bytes / string difference
introduced by Python 3, which forces you to *know* and *use* Unicode
every time you deal with some text; 2to3 is completely useless
here.
I can only imagine how difficult it can be to do such a conversion in
a project like Twisted or Django, where I/O plays a fundamental
role.
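
To make the difference concrete, here is a minimal sketch of the
decode-at-the-boundary pattern that Python 3 pushes network code
towards. The function names are hypothetical and are not pyftpdlib's
API, and the choice of UTF-8 as the wire encoding is an assumption made
only for the example.

```python
def parse_command(raw_line, encoding="utf-8"):
    """Decode one line of bytes from a socket into (command, argument)."""
    # Bytes arrive from the network; decode once, at the boundary,
    # and work with text everywhere else.
    text = raw_line.decode(encoding).rstrip("\r\n")
    command, _, argument = text.partition(" ")
    return command.upper(), argument


def format_reply(code, message, encoding="utf-8"):
    """Encode a text reply back into bytes before sending it."""
    return ("%d %s\r\n" % (code, message)).encode(encoding)


def mixing_fails():
    """Return True if concatenating bytes and text raises TypeError."""
    try:
        b"RETR " + "file.txt"  # allowed in 2.x, an error in 3.x
    except TypeError:
        return True
    return False
```

In Python 2 the same code could pass plain str objects around without
ever choosing an encoding, which is the kind of change that has to be
made by hand at every I/O boundary.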

The choice of forcing the user to use Unicode and "think in Unicode"
was a very brave one, and I'm sure it's for the better, but not
everyone wants to deal with that because Unicode is hard to swallow.
The majority of people prefer to stay with bytes and eventually learn
and introduce Unicode only when that is actually needed.


--- Giampaolo
http://code.google.com/p/pyftpdlib
http://code.google.com/p/psutil

From ncoghlan at gmail.com  Sun Jun 20 14:30:08 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 20 Jun 2010 22:30:08 +1000
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTikLBcYqyvxkd8DimEIHhv6p2TYjmytSvstwPs69@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>
	<AANLkTikLBcYqyvxkd8DimEIHhv6p2TYjmytSvstwPs69@mail.gmail.com>
Message-ID: <AANLkTil_GudWYVPuSjiLAImBr1xy86mfmOG5xHYHdSiD@mail.gmail.com>

On Sun, Jun 20, 2010 at 9:33 PM, Laurens Van Houtven <lvh at laurensvh.be> wrote:
> I'm not sure what sort of an effect it would have if we took it out.
> Maybe something we could try? I'm not sure it'd have much of a
> practical effect since most of the regulars' expertise isn't going to
> shift instantly, so getting actual help is probably going to be a bit
> rough on 3.x users.

Given the number of other links that are already in the status
message, it would be really nice if the comment could be updated to
something like:

"Is Python3 ready for me? http://python-commandments.org/python3.html"

i.e. make it clear that this is a question where the answer will vary
based on your use case, and provide a clear direction on where to get
more information.

That page could then be updated to give a more balanced view of the
pros of Python 3 (e.g. cleaner core language design, future direction
of the language, much better Unicode support) and the pros of Python 2
(e.g. wider installed base, better current third party library
support, greater existing developer base, larger support ecosystem,
greater #python expertise)

> (One of the problems people I've talked to in private that were
> "pretty miffed" about is the dissonance between #python and
> python-dev, and that there's some problem with people assuming things
> said on #python as being very authoritative answers (ha ha). I think
> this is really bad for Python as a whole and I'd love to hear ideas
> on how you guys think it could be fixed.)

There are always going to be differences in how different communities
see the world and even the "Python community" is far too large to have
a consistent point of view on almost any topic. So we'll likely have
to muddle through with various ideas slowly percolating through to
different parts of the community. That said, keeping in touch with the
#python crew is certainly something we haven't paid much attention to
in the past, but is probably just as important as staying in touch
with major library developers and the developers of other
implementations.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Sun Jun 20 14:51:40 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 14:51:40 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>
	<AANLkTikLBcYqyvxkd8DimEIHhv6p2TYjmytSvstwPs69@mail.gmail.com>
Message-ID: <20100620145140.68f22791@pitrou.net>

On Sun, 20 Jun 2010 13:33:35 +0200
Laurens Van Houtven <lvh at laurensvh.be> wrote:
> 
> (One of the problems people I've talked to in private that were
> "pretty miffed" about is the dissonance between #python and
> python-dev, and that there's some problem with people assuming things
> said on #python as being very authoritative answers (ha ha). I think
> this is really bad for Python as a whole and I'd love to hear ideas
> on how you guys think it could be fixed.)

Perhaps lower the tone a bit on http://pound-python.org/ ?
"foremost support system for developing quality Python
applications" ... "crack team of Python experts" ... "Your time won't
be wasted by architecture astronauts or trivial repetitions of the
docs".

(I understand these are slightly tongue-in-cheek but, if this page is
intended mainly for beginners, I think being descriptive is more
valuable)

Also, mention other support options there - primarily comp.lang.python,
of course, and the official documentation pages.

Regards

Antoine.


From N.D.Efford at leeds.ac.uk  Sun Jun 20 15:08:15 2010
From: N.D.Efford at leeds.ac.uk (Nick Efford)
Date: Sun, 20 Jun 2010 14:08:15 +0100 (BST)
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <mailman.6486.1277033057.32708.python-dev@python.org>
References: <mailman.6486.1277033057.32708.python-dev@python.org>
Message-ID: <alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>

> I'm sorry if you had the impression people wanted to nail you at the
> stake for using Python 3. If that's how you felt, it isn't true. I
> basically agree with Glyph. I don't think we've recently (I'm not
> omnipresent) told anyone who had any good reason to use it to stop using
> Python 3. If someone's doing work that actually needs Python 3 (most
> recent example a GSOC student porting Sphinx), we try our best to
> help, and AFAICT we've mostly been successful. (Please correct me if
> you think this is erroneous.). We don't get too many people that
> actually want or need that, but I'm guessing that's mostly because
> people porting libraries to py3k usually already know what they're
> doing so they don't need the first-line-of-defense thing for Python
> questions that #python tries to be.

Thanks for explaining your position on this so carefully,
Laurens.  You've made many reasonable points which I hope will
help to cool things down a little.

Clearly, there are situations where it makes sense to advocate
Python 2.X and other situations where people can be encouraged to
consider Python 3.  The issues that potential users need to
consider are too subtle to be represented fairly by the simple
advice to 'avoid Python 3', so can we not all agree to remove
it as a #python topic as a gesture of goodwill?  Nobody need
change their opinions or advocacy as a result, but it would have
the benefit of presenting #python in a more neutral and inclusive
light.

I've not used IRC much in the past, but if it would be useful for
someone like myself - a longtime Python user but recent and
enthusiastic Python 3 adopter - to offer my opinions and advice
on the issue to newcomers then I'm certainly willing to get
involved.

> We're still telling people to use Python 2.x by default because of a
> few major things:
>
> 1. going out on a limb here: well over 90% of those people are
> completely new to Python and out of those most of them completely new
> to programming too,

Not sure if I agree with you here; I regard people new to
programming as the prime candidates for using Python 3.  Many of
the language changes have the effect of making it significantly
easier to learn for newcomers (I wrote about this a while ago -
see http://www.comp.leeds.ac.uk/nde/papers/teachpy3.html).
Also, people new to Python or programming in general won't have
the burden of legacy code that needs to be converted.

The only situation in which I'd direct someone new to programming
away from Python 3 would be if they had a specific need to use a
library that wasn't yet supported.

> 2. the nicest libraries for doing a lot of stuff aren't ported yet, or
> are in the process of being ported but not yet recommended for actual
> use by their authors, (this seems to be a point of contention?)

This has certainly been the key issue for me.  Only in the past
two or three months have we got to the point where I feel I can commit
to Python 3 fully.  Six months ago, I definitely could not have
done so.  This is progress, and we need to be positive about it.

Regards,


Nick

-- 
Dr Nick Efford, School of | E: N.D.Efford at leeds.ac.uk
Computing, University of  | T: +44 113 343 6809
Leeds, Leeds, LS2 9JT, UK | W: http://www.comp.leeds.ac.uk/nde/
--------------------------+-----------------------------------------
PGP fingerprint: 6ADF 16C2 4E2D 320B F537  8F3C 402D 1C78 A668 8492

From lvh at laurensvh.be  Sun Jun 20 16:46:03 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 16:46:03 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTil_GudWYVPuSjiLAImBr1xy86mfmOG5xHYHdSiD@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>
	<AANLkTikLBcYqyvxkd8DimEIHhv6p2TYjmytSvstwPs69@mail.gmail.com>
	<AANLkTil_GudWYVPuSjiLAImBr1xy86mfmOG5xHYHdSiD@mail.gmail.com>
Message-ID: <AANLkTikzi2pyFv4_s0fniuDPZFUfcGe2Phkv-DRFzFrj@mail.gmail.com>

On Sun, Jun 20, 2010 at 2:30 PM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Sun, Jun 20, 2010 at 9:33 PM, Laurens Van Houtven <lvh at laurensvh.be> wrote:
> Given the number of other links that are already in the status
> message, it would be really nice if the comment could be updated to
> something like:
>
> "Is Python3 ready for me? http://python-commandments.org/python3.html"

Sounds like a great idea, I'll run it past the other folks.

> i.e. make it clear that this is a question where the answer will vary
> based on your use case, and provide a clear direction on where to get
> more information.

I think the reason #python regulars never saw this as a problem is
because people who actually ask do get this answer. At least they do
if Aaron, Allen, Brendon, Clovis, Stephen, Devin, me... (list of names
way too numerous to be exhaustive) are awake. Maybe the strong
language does scare people off from that critical
asking-for-more-information step, so yes, reviewing that would be a
good idea.

> There are always going to be differences in how different communities
> see the world and even the "Python community" is far too large to have
> a consistent point of view on almost any topic. So we'll likely have
> to muddle through with various ideas slowly percolating through to
> different parts of the community. That said, keeping in touch with the
> #python crew is certainly something we haven't paid much attention to
> in the past, but is probably just as important as staying in touch
> with major library developers and the developers of other
> implementations.

My thoughts exactly on both counts. Communication good, embrace heterogeneity :)

> Cheers,
> Nick.
>

Thanks for your input,
Laurens

From lvh at laurensvh.be  Sun Jun 20 16:50:28 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 16:50:28 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <20100620145140.68f22791@pitrou.net>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<73C61D25-CDCC-498E-B1EA-BDA608E2EDEC@voidspace.org.uk>
	<AANLkTikLBcYqyvxkd8DimEIHhv6p2TYjmytSvstwPs69@mail.gmail.com>
	<20100620145140.68f22791@pitrou.net>
Message-ID: <AANLkTilJrBQrkjH4S6VBLvYA7d6ehxSxIqoOi5rOt0zo@mail.gmail.com>

On Sun, Jun 20, 2010 at 2:51 PM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Sun, 20 Jun 2010 13:33:35 +0200
> Laurens Van Houtven <lvh at laurensvh.be> wrote:
> Perhaps lower the tone a bit on http://pound-python.org/ ?
> "foremost support system for developing quality Python
> applications" ... "crack team of Python experts" ... "Your time won't
> be wasted by architecture astronauts or trivial repetitions of the
> docs".

Noted, we'll say the same thing but differently.

> (I understand these are slightly tongue-in-cheek but, if this page is
> intended mainly for beginners, I think being descriptive is more
> valuable)

Yes, it is tongue-in-cheek, but perhaps a bit too much so :-) I didn't
write it, it just never struck me as a problem at the time. I think
the problem is that that page was created to fix a very specific
problem (explaining why #python isn't a search engine), and it
probably got written more out of something snapping than an attempt to
be informative.

> Also, mention other support options there - primarily comp.lang.python,
> of course, and the official documentation pages.

Will do.

> Regards
>
> Antoine.

Thanks for your input,
Laurens

From ncoghlan at gmail.com  Sun Jun 20 16:58:59 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Jun 2010 00:58:59 +1000
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>
References: <mailman.6486.1277033057.32708.python-dev@python.org>
	<alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>
Message-ID: <AANLkTinjFOQt7aybQbU4OJx8wFPHY42L9HtkHWauHbU0@mail.gmail.com>

On Sun, Jun 20, 2010 at 11:08 PM, Nick Efford <N.D.Efford at leeds.ac.uk> wrote:
> Not sure if I agree with you here; I regard people new to
> programming as the prime candidates for using Python 3.  Many of
> the language changes have the effect of making it significantly
> easier to learn for newcomers (I wrote about this a while ago -
> see http://www.comp.leeds.ac.uk/nde/papers/teachpy3.html).

That's actually one of the better write-ups I've seen regarding
several of the key benefits of the Python 3 transition. They're easy
to lose sight of when discussing the topic with the existing
developers that are bearing the cost of converting their code due to
changes that were made primarily for the benefit of new users.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at holdenweb.com  Sun Jun 20 17:37:53 2010
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 21 Jun 2010 00:37:53 +0900
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>
Message-ID: <hvlcl8$cjv$1@dough.gmane.org>

Glyph Lefkowitz wrote:
> On Jun 19, 2010, at 5:02 PM, Terry Reedy wrote:
> 
>> However I have very little experience with IRC and consequently have
>> little idea what getting a permanent, owned, channel like #python
>> entails. Hence the '?' that follows.
>>
>> What do others think?
> 
> Sure, this is a good idea.
> 
> Technically speaking, this is extremely easy.  Somebody needs to "/msg
> chanserv register #python3" and that's about it.  (In this case, that
> "someone" may need to be Brett Cannon, since he is the official group
> contact for Freenode regarding Python-related channels.)
> 
> Practically speaking, you will need a group of at least a dozen
> contributors, each in a different timezone, who sit there all day
> answering questions :).  Otherwise the ownership of the channel is just
> a signpost pointing at an empty room.
> 
Which is yet another reason I don't think it would be productive to
attempt any kind of pre-emptive action against the #python team. They do
serve a very useful purpose, and there is reasoned logic behind their
position even if we might wish it were different.

regards
 Steve

-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From stephen at thorne.id.au  Sun Jun 20 17:38:33 2010
From: stephen at thorne.id.au (Stephen Thorne)
Date: Mon, 21 Jun 2010 01:38:33 +1000
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
In-Reply-To: <AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
Message-ID: <20100620153833.GD20639@thorne.id.au>

On 2010-06-19, Arc Riley wrote:
> You mean Twisted support,

No. I don't.

Often, on #python, we get the situation where someone approaches us saying, "I
have this problem in my python code, why does this not work for me?" and
usually very quickly we establish the programmer has followed a tutorial or
attempted to use a library that depends on python 2, but the programmer is
running python 3.

Queried on why they are using python 3, the answer is frequently, "Because I
downloaded the latest version."

For those people, we believe it is too early to use python 3. When talking to
these people with a world view of "why shouldn't i use the latest version"
having a concrete preexisting statement in the topic we can point to is
invaluable.

We don't always ask those who are having python 3 problems to go to python2.
Often we simply explain about all strings being unicode or print now being a
function, and the conversation dies.

There are also programmers who definitely should be using python 3 for their
work. They know who they are. They do receive support in #python.

--

In writing this email to python-dev, I have reviewed my logs of #python
specifically looking for the phrase 'python 3'. Here are some packages that
were named in the conversations:

 - py2exe
 - cx_Freeze
 - twisted 
 - PIL
 - ctypes
 - email

I present this list because they are what programmers are coming to #python to
ask about, and that may be relevant to your discussion about python 3 ports.

-- 
Regards,
Stephen Thorne

From lvh at laurensvh.be  Sun Jun 20 17:51:28 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 17:51:28 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>
References: <mailman.6486.1277033057.32708.python-dev@python.org>
	<alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>
Message-ID: <AANLkTimWglB2_nAFoHGtwtAG9vUCUquoP_LVAmnupKcY@mail.gmail.com>

On Sun, Jun 20, 2010 at 3:08 PM, Nick Efford <N.D.Efford at leeds.ac.uk> wrote:
> Thanks for explaining your position on this so carefully,
> Laurens.  You've made many reasonable points which I hope will
> help to cool things down a little.

Cool, glad it's appreciated.

> Clearly, there are situations where it makes sense to advocate
> Python 2.X and other situations where people can be encouraged to
> consider Python 3.  The issues that potential users need to
> consider are too subtle to be represented fairly by the simple
> advice to 'avoid Python 3', so can we not all agree to remove
> it as a #python topic as a gesture of goodwill?

I like the idea of changing it to something that points to a more
detailed thing as someone suggested above. Ideally short and
completely neutral, like "2.x or 3.x? http://shorturl/whatever".

> Nobody need change their opinions or advocacy as a result,

I very much doubt that'd happen anyway ;-)

> but it would have the benefit of presenting #python in a more
> neutral and inclusive light.

+1

> I've not used IRC much in the past, but if it would be useful for
> someone like myself - a longtime Python user but recent and
> enthusiastic Python 3 adopter - to offer my opinions and advice
> on the issue to newcomers then I'm certainly willing to get
> involved.

Everybody's very welcome, the entire reason I'm putting time into this
is because apparently some people felt less welcome than I'd like them
to feel :-)

>> We're still telling people to use Python 2.x by default because of a
>> few major things:
>>
>> 1. going out on a limb here: well over 90% of those people are
>> completely new to Python and out of those most of them completely new
>> to programming too,
>
> Not sure if I agree with you here; I regard people new to
> programming as the prime candidates for using Python 3.  Many of
> the language changes have the effect of making it significantly
> easier to learn for newcomers (I wrote about this a while ago -
> see http://www.comp.leeds.ac.uk/nde/papers/teachpy3.html).
> Also, people new to Python or programming in general won't have
> the burden of legacy code that needs to be converted.

Very nice read. Most points are indeed common questions; we just tell
people how to work around them in 2.x. For example, whenever someone posts
old-style classes, someone will always point out to them that they
probably want new-style ones even if they don't get the difference
yet; for integer division we tell people to convert to float or use "from
__future__ import division"; if you use print, call it with exactly one
string and just build that string; never ever ever use input, just use
raw_input; that sort of stuff. Not always very clean, more of a
workaround. Also, stuff like chevron print is actively discouraged in
favor of using a logging module or e.g. sys.stderr. Of course, in py3k
you don't have to do any of this, which is even nicer :-)
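
A rough sketch of what those points look like on the py3k side, with
the 2.x workaround named in a comment next to each one. The names
Account, average and warn are invented for this illustration, and the
code is written for Python 3:

```python
import sys


class Account(object):
    # 2.x: the explicit (object) base is what makes this a new-style class.
    # 3.x: every class is new-style, so the base is optional.
    def __init__(self, balance):
        self.balance = balance


def average(values):
    # 2.x: needs "from __future__ import division" at the top of the module,
    # or a float() conversion, to avoid integer division here.
    # 3.x: / is always true division.
    return sum(values) / len(values)


def warn(message):
    # 2.x chevron form: print >>sys.stderr, message
    # 3.x: print is a function and takes the stream as a keyword argument.
    print(message, file=sys.stderr)


# Building one string and printing it reads the same on 2.x and 3.x.
line = "average of {0} is {1}".format([1, 2], average([1, 2]))
print(line)
warn("balance: {0}".format(Account(10).balance))
```

The raw_input advice is not shown because it needs an interactive
session: in 3.x the function is simply called input and always returns
a string.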

I'm guessing it's okay to link to this from the newer, more neutral pages? :-)

> The only situation in which I'd direct someone new to programming
> away from Python 3 would be if they had a specific need to use a
> library that wasn't yet supported.

Yeah, I think the reason for that rule is that the majority of people
asking about new software actually start or end up in this category.
No statistics to back that up, but the regulars seem to agree (again,
maybe we're biased). See Steve Thorne (Jerub)'s post in a parallel
thread.

Usually it's because they want to do something that people have
already solved, and #python is pretty strict about discouraging
implementing software that already exists. Of course, as the porting
of packages to Python 3.x progresses this point becomes more and more
moot. A possible solution is that we suggest that people, instead of
rolling their own thing from scratch, help to port an existing good
2.x lib to 3.x, or use 2.x? I don't think it's a good idea to start
encouraging NIH in new programmers :-)

>> 2. the nicest libraries for doing a lot of stuff aren't ported yet, or
>> are in the process of being ported but not yet recommended for actual
>> use by their authors, (this seems to be a point of contention?)
>
> This has certainly been the key issue for me.  Only in the past
> two or three months have we got to the point where I feel I can commit
> to Python 3 fully.  Six months ago, I definitely could not have
> done so.  This is progress, and we need to be positive about it.

Yeah, that message has been in the /topic for _WAY_ longer than 6 months.

>
> Regards,
>
>
> Nick

Thank you very much for your input,
Laurens

From steve at holdenweb.com  Sun Jun 20 18:00:03 2010
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 21 Jun 2010 01:00:03 +0900
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <20100620153833.GD20639@thorne.id.au>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100620153833.GD20639@thorne.id.au>
Message-ID: <4C1E3B03.5060802@holdenweb.com>

Stephen Thorne wrote:
> On 2010-06-19, Arc Riley wrote:
>> You mean Twisted support,
> 
> No. I don't.
> 
> Often, on #python, we get the situation where someone approaches us saying, "I
> have this problem in my python code, why does this not work for me?" and
> usually very quickly we establish the programmer has followed a tutorial or
> attempted to use a library that depends on python 2, but the programmer is
> running python 3.
> 
> Queried on why they are using python 3, the answer is frequently, "Because I
> downloaded the latest version."
> 
> For those people, we believe it is too early to use python 3. When talking to
> these people with a world view of "why shouldn't i use the latest version"
> having a concrete preexisting statement in the topic we can point to is
> invaluable.
> 
> We don't always ask those who are having python 3 problems to go to python2.
> Often we simply explain about all strings being unicode or print now being a
> function, and the conversation dies.
> 
> There are also programmers who definitely should be using python 3 for their
> work. They know who they are. They do receive support in #python.
> 
> --
> 
> In writing this email to python-dev, I have reviewed my logs of #python
> specifically looking for the phrase 'python 3'. Here are some packages that
> were named in the conversations:
> 
>  - py2exe
>  - cx_Freeze
>  - twisted 
>  - PIL
>  - ctypes
>  - email
> 
> I present this list because they are what programmers are coming to #python to
> ask about, and that may be relevant to your discussion about python 3 ports.
> 
Given the amount of interest this thread has generated I can't help
wondering why it isn't more prominent in python.org content. Is the
developer community completely disjoint from the web content editor
community?

If there is such a disconnect we should think about remedying it: a
large "Python 2 or 3?" button could link to a reasoned discussion of the
pros and cons as evinced in this thread. That way people will end up
with the right version more often (and be writing Python 2 that will
more easily migrate to Python 3, if they cannot yet use 3).

There seems to be a perception that the PSF can help fund developments,
and indeed Jesse Noller has made a small start with his sprint funding
proposal (which now has some funding behind it). I think if it is to do
so the Foundation will have to look for substantial new funding. I do
not currently understand where this funding would come from, and would
like to tap your developer creativity in helping to define how the
Foundation can effectively commit more developer time to Python.

GSoC and GHOP are great examples, but there is plenty of room for all
sorts of initiatives that result in development opportunities. I'd like
to help.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From lvh at laurensvh.be  Sun Jun 20 18:15:01 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 18:15:01 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvlcl8$cjv$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>
	<hvlcl8$cjv$1@dough.gmane.org>
Message-ID: <AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>

On Sun, Jun 20, 2010 at 5:37 PM, Steve Holden <steve at holdenweb.com> wrote:
> Glyph Lefkowitz wrote:
>> On Jun 19, 2010, at 5:02 PM, Terry Reedy wrote:
> Which is yet another reason I don't think it would be productive to
> attempt any kind of pre-emptive action against the #python team. They do
> serve a very useful purpose, and there is reasoned logic behind their
> position even if we might wish it were different.
>
> regards
>  Steve

I'm one of them so I'm a bit biased, but I'd say the biggest argument
is that it's not set in stone (I'm trying to fix it and the regulars
have been nothing but cooperative). Nobody from the #python people
realized this was a huge thing for people up until today. It's been up
there for a long time, and it's becoming less and less defensible
every passing day (and that's a good thing!), so we're basically
debating what ought to change and when. It's not really a matter of
disliking, it's more of a matter of "um, it's still up there because
nobody thought it had to go" :-)

FWIW: I think a separate #python3 channel would be a really bad idea.

thanks
Laurens

From lvh at laurensvh.be  Sun Jun 20 18:20:22 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 18:20:22 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>
	<hvlcl8$cjv$1@dough.gmane.org>
	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>
Message-ID: <AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>

Status update:

Topic now says:

NO LOL | Don't paste in here: use http://paste.pocoo.org/ |
http://pound-python.org/ | Include Python version in questions | 2.x or 3.x?
http://tinyurl.com/py2or3 | Tutorial: http://docs.python.org/tut/ | FAQ:
http://effbot.org/pyfaq/ | New Programmer? Read
http://tinyurl.com/thinkcspy2e | #python.web #wsgi #python-fr #python.de
#python-es #python.tw #python.pl #python-br #python-nl

Right now the shorturl points to the excellent
http://www.comp.leeds.ac.uk/nde/papers/teachpy3.html by Nick Efford,
until we get the Py2.x vs Py3.x page as suggested above done, which
will hopefully be in the next few hours.

pound-python.org not touched yet because 1) the appropriate person
isn't available right now 2) it's not as pressing a matter as the
other thing.


Thanks again for everyone's input on all of python-dev, #python,
#python-offtopic,
Laurens

From solipsis at pitrou.net  Sun Jun 20 19:01:59 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 19:01:59 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100620153833.GD20639@thorne.id.au>
	<4C1E3B03.5060802@holdenweb.com>
Message-ID: <20100620190159.76973b55@pitrou.net>

On Mon, 21 Jun 2010 01:00:03 +0900
Steve Holden <steve at holdenweb.com> wrote:
> 
> Given the amount of interest this thread has generated I can't help
> wondering why it isn't more prominent in python.org content. Is the
> developer community completely disjoint with the web content editor
> community?

Sorry for a naive question, but what is the web content editor
community?

Regards

Antoine.


From steve at holdenweb.com  Sun Jun 20 19:07:01 2010
From: steve at holdenweb.com (Steve Holden)
Date: Mon, 21 Jun 2010 02:07:01 +0900
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>	<hvlcl8$cjv$1@dough.gmane.org>	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>
	<AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
Message-ID: <4C1E4AB5.9070502@holdenweb.com>

Laurens Van Houtven wrote:
> Status update:
> 
> Topic now says:
> 
> NO LOL | Don't paste in here: use http://paste.pocoo.org/ |
> http://pound-python.org/ | Include Python version in questions | 2.x or 3.x?
> http://tinyurl.com/py2or3 | Tutorial: http://docs.python.org/tut/ | FAQ:
> http://effbot.org/pyfaq/ | New Programmer? Read
> http://tinyurl.com/thinkcspy2e | #python.web #wsgi #python-fr #python.de
> #python-es #python.tw #python.pl #python-br #python-nl
> 
> Right now the shorturl points to the excellent
> http://www.comp.leeds.ac.uk/nde/papers/teachpy3.html by Nick Efford,
> until we get the Py2.x vs Py3.x page as suggested above done, which
> will hopefully be in the next few hours.
> 
> pound-python.org not touched yet because 1) the appropriate person
> isn't available right now 2) it's not as pressing a matter as the
> other thing.
> 
> 
> Thanks again for everyone's input on all of python-dev, #python,
> #python-offtopic,
> Laurens
> 
And thanks for engaging so directly and responsively. The Python
community has impressed me again with its maturity.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000

From martin at v.loewis.de  Sun Jun 20 19:23:55 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Jun 2010 19:23:55 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
In-Reply-To: <20100620190159.76973b55@pitrou.net>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100620153833.GD20639@thorne.id.au>	<4C1E3B03.5060802@holdenweb.com>
	<20100620190159.76973b55@pitrou.net>
Message-ID: <4C1E4EAB.5000404@v.loewis.de>

Am 20.06.2010 19:01, schrieb Antoine Pitrou:
> On Mon, 21 Jun 2010 01:00:03 +0900
> Steve Holden<steve at holdenweb.com>  wrote:
>>
>> Given the amount of interest this thread has generated I can't help
>> wondering why it isn't more prominent in python.org content. Is the
>> developer community completely disjoint with the web content editor
>> community?
>
> Sorry for a naive question, but what is the web content editor
> community?

I think he's talking about the editors of www.python.org.

Regards,
Martin

From martin at v.loewis.de  Sun Jun 20 19:30:40 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Jun 2010 19:30:40 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>	<hvlcl8$cjv$1@dough.gmane.org>	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>
	<AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
Message-ID: <4C1E5040.9040504@v.loewis.de>

Am 20.06.2010 18:20, schrieb Laurens Van Houtven:
> 2.x or 3.x?
> http://tinyurl.com/py2or3

If you are interested, we could host any material that somebody would 
want to provide on http://python.org/py2or3 (which would be one letter 
shorter :-). We could also make this a redirect.

Regards,
Martin

From stephen at xemacs.org  Sun Jun 20 19:30:17 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Jun 2010 02:30:17 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100620113256.7ba8d86a@pitrou.net>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100620113256.7ba8d86a@pitrou.net>
Message-ID: <87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>

Antoine Pitrou writes:

 > I think it's an unfortunate analogy.

Propose a better one, then.  I'm definitely not wedded to the ones
I've proposed!

But we have a PR problem *now*.  The loyal opposition clearly intend
to continue trash-talking Python 3 until the libraries get to 100% (or
a government-approved approximation of 100%).  The topic on #python
seems unlikely to change at this point, with both Glyph and JP
pointedly failing to denounce it publicly, while Stephen defends it
and says it's not going to change as long as the libraries aren't
done.

What do you suggest?  Or do you think there's no PR problem we should
worry about, just accept that this is going to be a further drag on
adoption and improvement, and keep on keeping on?


From martin at v.loewis.de  Sun Jun 20 19:41:37 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Jun 2010 19:41:37 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
Message-ID: <4C1E52D1.4030801@v.loewis.de>

> I can only imagine how difficult can it be to do such a conversion in
> a project like Twisted or Django where the I/O plays a fundamental
> role.

For Django, you don't need to imagine, but can look at the actual changes:

http://bitbucket.org/loewis/django-3k/

> The choice of forcing the user to use Unicode and "think in Unicode"
> was a very brave one, and I'm sure it's for the better, but not
> everyone wants to deal with that because Unicode is hard to swallow.
> The majority of people prefer to stay with bytes and eventually learn
> and introduce Unicode only when that is actually needed.

It's not really an issue with "Unicode", but rather with "characters".
Surprisingly, most people don't grasp the notion of "abstract character".

This is similar to not grasping the notion of "abstract integral 
number", which most programmers master over time (although my students 
typically need a year or more to get the difference between "decimal 
number", "two's complement", and "abstract integer"; the difference 
between "character string" and "number" is easier (*)).

For numbers, programmers are forced to accept the abstraction. For 
character strings, they apparently resist much more.

Regards,
Martin

(*) An anecdotal dialog may read like this
Teacher: "How are numbers represented in Python?"
Student: "In decimal."
T: "How so?"
S: "I can do
       x = 47
     and it is decimal. I can then do
       print x
     and get "47". See?"
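
A small sketch of the distinction the student is missing, written for
Python 3 (so print is a function) with purely illustrative variable
names: the same abstract integer can be written with several literals
and rendered in several notations, and decimal is only the default that
str() and print() happen to choose.

```python
# One abstract integer, written with three different literals.
x = 47
y = 0x2F
z = 0b101111

# All three names refer to the same abstract number.
same_value = (x == y == z)

# Decimal is just the default rendering; other notations are equally valid.
renderings = {
    "decimal": str(x),
    "hex": hex(x),
    "binary": bin(x),
    "octal": oct(x),
}

# A character string of digits is a different kind of object altogether.
digits = "47"
string_is_number = (digits == x)

print(same_value, renderings, string_is_number)
```

The integer has no "native" notation at the language level; the digits
only appear when it is converted to a character string.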

From stephen at xemacs.org  Sun Jun 20 19:42:53 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Jun 2010 02:42:53 +0900
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTimWglB2_nAFoHGtwtAG9vUCUquoP_LVAmnupKcY@mail.gmail.com>
References: <mailman.6486.1277033057.32708.python-dev@python.org>
	<alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>
	<AANLkTimWglB2_nAFoHGtwtAG9vUCUquoP_LVAmnupKcY@mail.gmail.com>
Message-ID: <87eig1fvmq.fsf@uwakimon.sk.tsukuba.ac.jp>

Laurens Van Houtven writes:
 > > The only situation in which I'd direct someone new to programming
 > > away from Python 3 would be if they had a specific need to use a
 > > library that wasn't yet supported.
 > 
 > Yeah, I think the reason for that rule is that the majority of people
 > asking about new software actually start or end up in this category.

I think that the most experienced people have absurdly high standards
for "support" compared to those new to programming.  I hope they check
their advice against the real requirements of the new programmer.

 > Usually it's because they want to do something that people have
 > already solved,

If they're new to programming, they're already in adventure mode.  Why
not point out the Road Less Traveled?  That will make all the
difference.  Of course you should point out that it's going to be
bumpier, and of course that is likely to push the majority of
practical folks back to Python 2.  But some of them are likely to be
willing to endure a bit of frustration, especially if they're told
that their bug reports will be listened to seriously on python-dev
(given help from an experienced hand in formatting them!)

 > A possible solution is that we suggest that people, instead of
 > rolling their own thing from scratch, help to port an existing good
 > 2.x lib to 3.x, or use 2.x?

Exactly.  Don't give them rose-colored glasses about porting, and warn
that some are just plain broken (eg, because of inappropriate
assumptions about bytes vs Unicode).  But on the other hand, some will
mostly work for them, and their bug reports on the corner cases will
be helpful.

 > I don't think it's a good idea to start encouraging NIH in new
 > programmers :-)

Agreed.


From solipsis at pitrou.net  Sun Jun 20 19:47:47 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 19:47:47 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100620113256.7ba8d86a@pitrou.net>
	<87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100620194747.0c3d0a82@pitrou.net>

On Mon, 21 Jun 2010 02:30:17 +0900
"Stephen J. Turnbull" <stephen at xemacs.org> wrote:
> Antoine Pitrou writes:
> 
>  > I think it's an unfortunate analogy.
> 
> Propose a better one, then.  I'm definitely not wedded to the ones
> I've proposed!

I'm not sure why you want an analogy. Python 3 improves the language
and drops legacy cruft. Bringing in C++ makes the description
unnecessarily contentious and loaded (because C++ has a rather bad
reputation amongst many people; recently Linus Torvalds explained
again why he thought C was much more appropriate a programming
language). And it's not even warranted, because the situation is vastly
different.

> What do you suggest?  Or do you think there's no PR problem we should
> worry about, just accept that this is going to be a further drag on
> adoption and improvement, and keep on keeping on?

I suppose the PR problem could be solved by having an official page on
python.org explain what the new features and advantages of Python 3 over
Python 2 are. There's no such thing right now; actually, I'm not sure
there's a Web page explaining clearly what the difference is about, why
it was done in such a compatibility-breaking way, and what we advise
(both actual and potential) users to do.

I suppose that's a task for the "Web content editor community".

Regards

Antoine.

From turnbull at sk.tsukuba.ac.jp  Sun Jun 20 19:48:01 2010
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Mon, 21 Jun 2010 02:48:01 +0900
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
Message-ID: <87d3vlfve6.fsf@uwakimon.sk.tsukuba.ac.jp>

Laurens Van Houtven writes:

 > Also, I'm pretty sure nobody has ever said that Python 3.x was a
 > "failure", or anything like it. #python has claims that Python 3.x, as
 > a platform for building production apps, is a work in progress

How about "Python 3 is a work in progress" for the topic?  That seems
to me to strike exactly the right balance, and encourage the
interested to ask the right kind of question.


From solipsis at pitrou.net  Sun Jun 20 19:55:47 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 19:55:47 +0200
Subject: [Python-Dev] email package status in 3.X
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
Message-ID: <20100620195547.3c7882ca@pitrou.net>

On Sun, 20 Jun 2010 14:26:28 +0200
Giampaolo Rodolà <g.rodola at gmail.com> wrote:
> I attempted to port pyftpdlib to python 3 several times and the
> biggest show stopper has always been the bytes / string difference
> introduced by Python 3 which forces you to *know* and *use* Unicode
> every time you deal with some text and 2to3 is completely useless
> here.

I don't really understand what the difficulties are. A character is a
character; converting from bytes to characters requires knowing the
encoding, which your protocol should specify somewhere (of course, I
suppose FTP is old and crummy enough that it may not specify anything).

An "encoding" is nothing more than a transformation. When you get
gzipped data, you must decompress it before doing anything useful out
of it. Similarly, when you get (say) UTF-8 data, you must decode it
before doing anything useful out of it.
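
The analogy can be sketched in a few lines of Python 3. The sample text
is an arbitrary assumption; the point is only that decoding, like
decompression, is a transformation that has to be undone before the
data is usable, and that layered transformations are undone in reverse
order.

```python
import gzip

# Sample text with two non-ASCII characters (written as escapes).
text = "na\u00efve caf\u00e9"

# Encoding: characters -> bytes (here UTF-8).
encoded = text.encode("utf-8")

# Compression: bytes -> different bytes.
compressed = gzip.compress(encoded)

# To get the text back, undo each transformation in reverse order.
decompressed = gzip.decompress(compressed)
decoded = decompressed.decode("utf-8")

print(decoded == text)
print(len(text), len(encoded))
```

Note that the character count and the byte count differ, which is one
reason the two cannot be treated as the same thing.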

> I can only imagine how difficult can it be to do such a conversion in
> a project like Twisted or Django where the I/O plays a fundamental
> role.

Twisted actually seems to enforce the bytes / unicode separation quite
well already, so I don't think they should have many problems on that
front. Modern Web frameworks seem to be in the same boat (they already
give the Web developer unicode strings to play with, and handle the
encoding/decoding at the IO boundary transparently).

> The choice of forcing the user to use Unicode and "think in Unicode"
> was a very brave one, and I'm sure it's for the better, but not
> everyone wants to deal with that because Unicode is hard to swallow.

Could Google fund a project named "Unicode Swallow"?

Regards

Antoine.


From guido at python.org  Sun Jun 20 19:57:05 2010
From: guido at python.org (Guido van Rossum)
Date: Sun, 20 Jun 2010 10:57:05 -0700
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
Message-ID: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>

On Sun, Jun 20, 2010 at 5:26 AM, Giampaolo Rodolà <g.rodola at gmail.com> wrote:
> 2010/6/20 Steven D'Aprano <steve at pearwood.info>:
>> Python 2.x introduced Unicode strings. Python 3.x merely makes them the
>> default.
>
> "Merely"? To me this looks as the main reason why a lot of projects
> haven't been ported to Python 3 yet.
> I attempted to port pyftpdlib to python 3 several times and the
> biggest show stopper has always been the bytes / string difference
> introduced by Python 3 which forces you to *know* and *use* Unicode
> every time you deal with some text

Ah, but this is the crux of the difference between Python 2 and 3. The
distinction between text and bytes is crucial, and Python 2 tried to
paper over the differences in a way that led to endless pain. Many
clumsy and shaky hacks have been invented to alleviate the pain but it
never goes away. Python 3 takes a much clearer stance on the
difference -- your code *must* be aware of the distinction and it
*must* deal with it.

The problem comes exactly where you find it: when *porting* existing
code that uses aforementioned ways to alleviate the pain, you find
that the hacks no longer work and a properly layered design is needed
that clearly distinguishes between which variables contain bytes and
which text.
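
A minimal sketch of what such a layered design could look like, assuming
an FTP-like line protocol. The function names parse_command and
format_reply are hypothetical and not taken from pyftpdlib or any other
library: bytes are decoded once on the way in, everything in between
works on text, and text is encoded once on the way out.

```python
from typing import Tuple


def parse_command(raw: bytes, encoding: str = "ascii") -> Tuple[str, str]:
    """Boundary in: decode the wire bytes once, then work with text only."""
    line = raw.decode(encoding).rstrip("\r\n")
    verb, _, argument = line.partition(" ")
    return verb.upper(), argument


def format_reply(code: int, message: str, encoding: str = "ascii") -> bytes:
    """Boundary out: encode once, just before the data leaves the program."""
    return f"{code} {message}\r\n".encode(encoding)


# Mixing the two types is an error in Python 3 instead of a silent guess.
try:
    b"USER " + "anonymous"
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

verb, argument = parse_command(b"user anonymous\r\n")
reply = format_reply(331, "Password required")
print(mixing_allowed, verb, argument, reply)
```

With this split, every variable is either bytes or text by construction,
and the encoding is named in exactly two places.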

> and 2to3 is completely useless here.

Alas, this is true, because it is not a matter of changing some simple
things. The old ways are no longer supported.

> I can only imagine how difficult can it be to do such a conversion in
> a project like Twisted or Django where the I/O plays a fundamental
> role.

Django actually took one of the most principled stances towards this
issue and has already been ported (although the port is not maintained
by the core Django developers yet). I can't speak for Twisted but I
know they have some funding towards a port.

The problem is often worse for smaller libraries (like I presume
pyftpdlib is) which don't have a clear stance about bytes vs. text.

Another problem is some internet protocols (of which FTP I believe is
one) which use antiquated models for dealing with binary vs. text
data, often focusing entirely on encodings (usually and mistakenly
called "character sets") rather than on proper Unicode support.

> The choice of forcing the user to use Unicode and "think in Unicode"
> was a very brave one, and I'm sure it's for the better, but not
> everyone wants to deal with that because Unicode is hard to swallow.

Education is needed. When you search Google (or Bing, for that matter
:-) for "python unicode" the first hit is
http://www.amk.ca/python/howto/unicode, which is highly detailed but
probably too much information for the typical person faced with a
UnicodeError exception traceback (that page is also focused on Python
2). What we need is a cookbook on how to deal with various common
situations.
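
As an illustration of the kind of entry such a cookbook might contain
(a sketch only, with made-up sample data), here is one common situation
in Python 3: bytes arrive in an encoding other than the one assumed, and
the decode step raises UnicodeDecodeError. The error handlers used below
are the standard "replace" and "surrogateescape" handlers.

```python
data = b"caf\xe9"  # Latin-1 bytes; not valid UTF-8

# 1. Decoding with the wrong codec fails loudly.
try:
    data.decode("utf-8")
    failed = False
except UnicodeDecodeError:
    failed = True

# 2. If the real encoding is known, name it explicitly.
as_latin1 = data.decode("latin-1")

# 3. If it is not known and lossy output is acceptable, substitute U+FFFD.
replaced = data.decode("utf-8", errors="replace")

# 4. If the bytes must survive a round trip unchanged, smuggle the
#    undecodable bytes through as lone surrogates.
smuggled = data.decode("utf-8", errors="surrogateescape")
restored = smuggled.encode("utf-8", errors="surrogateescape")

print(failed, as_latin1, replaced, restored)
```

Which of the three responses is right depends on the application; the
sketch only shows that each is a deliberate choice rather than a hack.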

> The majority of people prefer to stay with bytes and eventually learn
> and introduce Unicode only when that is actually needed.

This is exactly what we tried to do in Python 2 and it was a flagrant
disaster. It's just that the work-arounds people have created to deal
with it don't port cleanly -- which is by design.

This is why I've always said that I assumed that the Python 3
transition would take 5 years.

On the #python issue, I expect that IRC is much less influential than
some here fear (and than some fervent IRC users believe). I don't see
reason for panic or heavy-handed interference. OTOH engaging the
channel operators more in python-dev sounds like a useful approach.

-- 
--Guido van Rossum (python.org/~guido)

From stephen at xemacs.org  Sun Jun 20 19:58:14 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Jun 2010 02:58:14 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100620113256.7ba8d86a@pitrou.net>
	<87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <87bpb5fux5.fsf@uwakimon.sk.tsukuba.ac.jp>

Pass the ketchup, I need to eat my words.

I wrote:

 > The loyal opposition clearly intend to continue trash-talking
 > Python 3 until the libraries get to 100% (or a government-approved
 > approximation of 100%).  The topic on #python seems unlikely to
 > change at this point, with both Glyph and JP pointedly failing to
 > denounce it publicly, while Stephen defends it and says it's not
 > going to change as long as the libraries aren't done.

It would seem from posts I received after replying (local mail glitch,
should have known there was more coming :-( ) that the facts are that
the topic is quite likely to change soonish, and that "trash-talking"
is being done, if at all, by trolls.  (Having spent a few hours on
#python today, I see that's a lot more possible than I would have
believed in this community.  Nobody's immune.)

Glyph, JP, and Stephen have my personal apologies.


From lvh at laurensvh.be  Sun Jun 20 20:02:57 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 20:02:57 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100620113256.7ba8d86a@pitrou.net>
	<87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTikXAzsdIQD8nJO5NqsShb4zDR_uixfS8NaK2gjZ@mail.gmail.com>

On Sun, Jun 20, 2010 at 7:30 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Antoine Pitrou writes:
> But we have a PR problem *now*.  The loyal opposition clearly intend
> to continue trash-talking Python 3 until the libraries get to 100% (or
> a government-approved approximation of 100%).  The topic on #python
> seems unlikely to change at this point, with both Glyph and JP
> pointedly failing to denounce it publicly, while Stephen defends it
> and says it's not going to change as long as the libraries aren't
> done.

Huh? We just changed the topic on #python because people complained
about it. We didn't do it earlier because we didn't know it was a
problem. Defending it doesn't mean it's set in stone :-)

I don't wanna come across like a jerk but could we please not use
loaded terms like "loyal opposition" and "trash-talking"? I don't
really think that's what people do or are (or at least want to
be/intend to do). I've really honestly tried my best to fix this
situation (see the other thread) and the people whom I've gotten input
from (both here and in the IRC channels) have been nothing but
helpful.

> What do you suggest?  Or do you think there's no PR problem we should
> worry about, just accept that this is going to be a further drag on
> adoption and improvement, and keep on keeping on?

I very much like Martin and Antoine's ideas of putting the thing up on
python.org, that might also solve people's problems with the apparent
dissonance between #python and python-dev/the PSF that neither side
really wants. To the contrary, I think everyone wants this situation
to improve, including Guido, apparently. Myself included, I think
everyone stands to gain here.


thanks for listening
Laurens

From lvh at laurensvh.be  Sun Jun 20 20:08:19 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 20:08:19 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <87d3vlfve6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<87d3vlfve6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTimhRlwa0vCR0cS18r_IuAUU0weFwSZnU5bEG8y8@mail.gmail.com>

On Sun, Jun 20, 2010 at 7:48 PM, Stephen J. Turnbull
<turnbull at sk.tsukuba.ac.jp> wrote:
> Laurens Van Houtven writes:
>
>  > Also, I'm pretty sure nobody has ever said that Python 3.x was a
>  > "failure", or anything like it. #python has claims that Python 3.x, as
>  > a platform for building production apps, is a work in progress
>
> How about "Python 3 is a work in progress" for the topic?  That seems
> to me to strike exactly the right balance, and encourage the
> interested to ask the right kind of question.

I think even that's a bit too loaded, as a sign of goodwill I think
we're going to go with something completely neutral like "2.x vs 3.x".
But I'm not going to argue that ad nauseam because it's really just
bikeshedding.

thanks for your input
Laurens

From pje at telecommunity.com  Sun Jun 20 20:09:43 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 20 Jun 2010 14:09:43 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
In-Reply-To: <4C1E3B03.5060802@holdenweb.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100620153833.GD20639@thorne.id.au>
	<4C1E3B03.5060802@holdenweb.com>
Message-ID: <20100620180957.A71943A4099@sparrow.telecommunity.com>

At 01:00 AM 6/21/2010 +0900, Steve Holden wrote:
>If there is such a disconnect we should think about remedying it: a
>large "Python 2 or 3?" button could link to a reasoned discussion of the
>pros and cons as evinced in this thread. That way people will end up
>with the right version more often (and be writing Python 2 that will
>more easily migrate to Python 3, if they cannot yet use 3).

+1


From lvh at laurensvh.be  Sun Jun 20 20:15:33 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 20:15:33 +0200
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <4C1E3B03.5060802@holdenweb.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100620153833.GD20639@thorne.id.au>
	<4C1E3B03.5060802@holdenweb.com>
Message-ID: <AANLkTimFc492bSQhGYk5mHXc65jOocCRnTcj39T_hceD@mail.gmail.com>

> If there is such a disconnect we should think about remedying it: a
> large "Python 2 or 3?" button could link to a reasoned discussion of the
> pros and cons as evinced in this thread. That way people will end up
> with the right version more often (and be writing Python 2 that will
> more easily migrate to Python 3, if they cannot yet use 3).

Me and ikanobori (Simon De Vlieger) are working on this.

Laurens

From stephen at xemacs.org  Sun Jun 20 20:35:30 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Jun 2010 03:35:30 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
Message-ID: <87aaqpft71.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:

 > On the #python issue, I expect that IRC is much less influential than
 > some here fear (and than some fervent IRC users believe). I don't see
 > reason for panic or heavy-handed interference. OTOH engaging the
 > channel operators more in python-dev sounds like a useful approach.

More vice-versa, I now think.  Ie, (somewhat) greater python-dev
presence on #python is more important.  I sort of assumed that people
actually participated in #python, as a number do in c.l.p, but that
doesn't seem to be so.  At least while I was there, I didn't see
anybody else who seemed to be python-dev, whether core or the regular
denizens of the peanut gallery.

>From a few hours monitoring and participating in #python, Laurens
gives a pretty accurate summary of the kind of people in the channel.  I
didn't see anything about Python 3, but I can definitely imagine there
being Python-3-baiting trolls.  There certainly were a few trollish
posters.

Anyway, what I personally plan to do is put in a couple of hours a
week on #python, and I probably mostly won't mention Python 3 unless
asked, and maybe in discussing Unicode issues.  While I don't claim to
be particularly *representative* of python-dev, an additional
dimension of diversity should go a long way.


From pje at telecommunity.com  Sun Jun 20 20:40:56 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 20 Jun 2010 14:40:56 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
Message-ID: <20100620184120.10EFB3A4099@sparrow.telecommunity.com>

At 10:57 AM 6/20/2010 -0700, Guido van Rossum wrote:
>The problem comes exactly where you find it: when *porting* existing
>code that uses aforementioned ways to alleviate the pain, you find
>that the hacks no longer work and a properly layered design is needed
>that clearly distinguishes between which variables contain bytes and
>which text.

Actually, I would say that it's more that (in the network protocol 
case) we *have* bytes, some of which we would like to *treat* as 
text, yet do not wish to constantly convert back and forth to 
full-blown unicode -- especially since the protocols themselves 
designate ASCII or latin-1 at the transport layer (sometimes with 
odder encodings above, but these already have to be explicitly dealt 
with by existing code).

While reading over this thread, I'm wondering whether at least my 
(WSGI-related) problems in this area would be solved by the 
availability of a type (say "bstr") that was simply a wrapper 
providing string-like behavior over an underlying bytes, byte array, 
or memoryview, that would produce objects of compatible type when 
combined with strings (by encoding them to match).

Then, I could wrap bytes with it to pass them to string operations, 
and then feed them back into everything else.  The bstr type ideally 
would be directly compatible with bytes I/O, or at least have a 
.bytes attribute that would be.

It seems like that would reduce WSGI porting issues quite a bit, 
since it would mostly consist of throwing extra bstr() calls in where 
things are breaking, and maybe grabbing the .bytes attribute for I/O.

This approach would still be explicit as to what types you're working 
with, but would not require O(n) *conversions* at every interaction 
boundary.  It would be limited, of course, to single-byte encodings 
with all characters (0-255) valid.

OTOH, maybe there should just be a bytestrings module with 
bytestrings.ascii and bytestrings.latin1, and between the two that 
should cover the network protocol needs quite well.

Actually, if the Python 3 str() constructor could do O(1) conversion 
for the latin-1 case (i.e., just wrapped the underlying bytes), I 
would just put, "bstr = lambda x: str(x,'latin-1')" at the top of my 
programs and have roughly the same effect.

This idea is still a bit half-baked, but a more baked version might 
be just the ticket for porting stuff that used str to work with bytes 
in 2.x, if only because writing, e.g.:

      newurl = bstr(urljoin(bstr(base), 'subdir'))

seems so much saner than writing *this* everywhere:

      newurl = str(urljoin(str(base, 'latin-1'), 'subdir'), 'latin-1')

It is perhaps a bit late to propose this idea, since ideally we would 
also want to use it in 2.x to aid porting.  But I'm curious if any 
other people here experiencing byte/unicode woes in relation to 
network protocols would find this a solution to their chief 
frustration.  (i.e., that the stdlib often insists now on strings, 
where effectively bytes were usable before, and thus one must do 
conversions both coming and going.)
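
To make the "bstr" idea above concrete, here is a minimal sketch of what
such a wrapper might look like on Python 3.  Everything in it is an
assumption for illustration: the class body, the `_coerce` helper, and the
choice of which string methods to forward are hypothetical, not an existing
stdlib or WSGI API.  It only shows the two properties described above:
mixing with str yields a bstr again (the str is encoded to latin-1 to
match), and the underlying data stays reachable through `.bytes` for I/O.

```python
class bstr(object):
    """Hypothetical string-like wrapper over bytes; latin-1 only."""

    # latin-1 maps every byte value 0-255 to exactly one character.
    encoding = "latin-1"

    def __init__(self, data=b""):
        self.bytes = self._coerce(data)

    @classmethod
    def _coerce(cls, value):
        # Turn any supported operand into plain bytes.
        if isinstance(value, bstr):
            return value.bytes
        if isinstance(value, str):
            return value.encode(cls.encoding)
        if isinstance(value, (bytes, bytearray, memoryview)):
            return bytes(value)
        raise TypeError("cannot treat %s as a byte string"
                        % type(value).__name__)

    def __add__(self, other):
        # bstr + str (or bytes) -> bstr
        return bstr(self.bytes + self._coerce(other))

    def __radd__(self, other):
        # str (or bytes) + bstr -> bstr
        return bstr(self._coerce(other) + self.bytes)

    def startswith(self, prefix):
        return self.bytes.startswith(self._coerce(prefix))

    def split(self, sep=None, maxsplit=-1):
        sep_bytes = None if sep is None else self._coerce(sep)
        return [bstr(part) for part in self.bytes.split(sep_bytes, maxsplit)]

    def __str__(self):
        # Explicit decode only when real text is asked for.
        return self.bytes.decode(self.encoding)

    def __repr__(self):
        return "bstr(%r)" % (self.bytes,)


path = bstr(b"/base/") + "subdir"   # str operand is encoded to match
payload = path.bytes                # hand plain bytes to the I/O layer
```

A real version would need many more methods, and it would not by itself
make stdlib functions such as urljoin accept it; the sketch only shows the
shape of the type.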


From benjamin at python.org  Sun Jun 20 20:49:02 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 20 Jun 2010 13:49:02 -0500
Subject: [Python-Dev] issue 8959
Message-ID: <AANLkTil-NQC0hx-PzW0i0RV1yyJ8ADvFMhw7ylkMDx8l@mail.gmail.com>

We currently have one release blocker for 2.7:
http://bugs.python.org/issue8959
It is a Windows and ctypes regression. As far as I can tell, the
offending revision could just be
reverted, but it does not merge cleanly. Can anyone offer more
expertise?

-- 
Regards,
Benjamin

From martin at v.loewis.de  Sun Jun 20 21:10:54 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 20 Jun 2010 21:10:54 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <87d3vlfve6.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<87d3vlfve6.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <4C1E67BE.7050804@v.loewis.de>

Am 20.06.2010 19:48, schrieb Stephen J. Turnbull:
> Laurens Van Houtven writes:
>
>   >  Also, I'm pretty sure nobody has ever said that Python 3.x was a
>   >  "failure", or anything like it. #python has claims that Python 3.x, as
>   >  a platform for building production apps, is a work in progress
>
> How about "Python 3 is a work in progress" for the topic?

I wouldn't say that, either - not more than Python 2 was a work in 
progress over the last 10 years.

Regards,
Martin

From lvh at laurensvh.be  Sun Jun 20 21:25:41 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 21:25:41 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <87eig1fvmq.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <mailman.6486.1277033057.32708.python-dev@python.org>
	<alpine.LFD.2.00.1006201406170.30927@cslin-gps.csunix.comp.leeds.ac.uk>
	<AANLkTimWglB2_nAFoHGtwtAG9vUCUquoP_LVAmnupKcY@mail.gmail.com>
	<87eig1fvmq.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTila9QCtsnZFb_iSH-9fcZtakWH9U0MsAb48GTlV@mail.gmail.com>

On Sun, Jun 20, 2010 at 7:42 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Laurens Van Houtven writes:
>  > Yeah, I think the reason for that rule is that the majority of people
>  > asking about new software actually start or end up in this category.
>
> I think that the most experienced people have absurdly high standards
> for "support" compared to those new to programming.  I hope they check
> their advice against the real requirements of the new programmer.

Maybe. I'm not very sure about this: for example quite a few parts in
Twisted are pretty hazy voodoo magic to me ;-) I actually recommend
the high standards stuff to newbies specifically because it's high
standards. If I meet some bug, I can probably work around it, but I
imagine that it'd be much more frustrating for a newbie to come into
contact with a bunch of stuff that really isn't very well polished or
supported? I could be wrong.

>  > Usually it's because they want to do something that people have
>  > already solved,
>
> If they're new to programming, they're already in adventure mode.  Why
> not point out the Road Less Traveled?  That will make all the
> difference.  Of course you should point out that it's going to be
> bumpier, and of course that is likely to push the majority of
> practical folks back to Python 2.

Three big reasons I can think of: because it doesn't always exist,
because even if it does exist we don't always know about it, and
because people actually helping people in #python would be far less
adept at helping people with it :-) We have a bunch of people that end
up doing their own thing anyway now; that just means we can't be as
helpful later when they have more questions.

> But some of them are likely to be
> willing to endure a bit of frustration, especially if they're told
> that their bug reports will be listened to seriously on python-dev
> (given help from an experienced hand in formatting them!)

Maybe that would help, yeah. We have a bunch of people now that start
and then give up. They don't port, because they can't be bothered.
They just start from scratch.

>  > A possible solution is that we suggest that people, instead of
>  > rolling their own thing from scratch, help to port an existing good
>  > 2.x lib to 3.x, or use 2.x?
>
> Exactly.  Don't give them rose-colored glasses about porting, and warn
> that some are just plain broken (eg, because of inappropriate
> assumptions about bytes vs Unicode).  But on the other hand, some will
> mostly work for them, and their bug reports on the corner cases will
> be helpful.

I think that's usually more effort than new programmers are willing to
put in; people tend to underestimate the cost of developing something
from scratch in my experience. But sure, we all agree it's a good
idea, so let's put it in the official thing about 2.x vs 3.x :)

>  > I don't think it's a good idea to start encouraging NIH in new
>  > programmers :-)
>
> Agreed.

I think we're kind of getting into the territory of personal preferences here.

Thanks for your input,
Laurens

From lvh at laurensvh.be  Sun Jun 20 21:34:43 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 21:34:43 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <4C1E67BE.7050804@v.loewis.de>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<87d3vlfve6.fsf@uwakimon.sk.tsukuba.ac.jp>
	<4C1E67BE.7050804@v.loewis.de>
Message-ID: <AANLkTikwIwamT0q9z6KQuqg7bs8VKALbb2OMfsLWmOe-@mail.gmail.com>

On Sun, Jun 20, 2010 at 9:10 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> Am 20.06.2010 19:48, schrieb Stephen J. Turnbull:
>> How about "Python 3 is a work in progress" for the topic?
>
> I wouldn't say that, either - not more than Python 2 was a work in progress
> over the last 10 years.
>
> Regards,
> Martin

Yeah, this is why I really like a completely neutral topic.

thanks,
Laurens

From fuzzyman at voidspace.org.uk  Sun Jun 20 21:55:17 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sun, 20 Jun 2010 20:55:17 +0100
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
 status in 3.X)
In-Reply-To: <4C1E3B03.5060802@holdenweb.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100620153833.GD20639@thorne.id.au>
	<4C1E3B03.5060802@holdenweb.com>
Message-ID: <4C1E7225.70305@voidspace.org.uk>

On 20/06/2010 17:00, Steve Holden wrote:
> [snip...]
>> --
>>
>> In writing this email to python-dev, I have reviewed my logs of #python
>> specifically looking for the phrase 'python 3'. Here are some packages that
>> were named in the conversations:
>>
>>   - py2exe
>>   - cx_Freeze
>>   - twisted
>>   - PIL
>>   - ctypes
>>      

What is the problem with ctypes in Python 3? Are there particular 
problems with it? It is part of the standard library and available, right?

>>   - email
>>
>> I present this list because they are what programmers are coming to #python to
>> ask about, and that may be relevant to your discussion about python 3 ports.
>>
>>      
> Given the amount of interest this thread has generated I can't help
> wondering why it isn't more prominent in python.org content. Is the
> developer community completely disjoint with the web content editor
> community?
>    

The "web content editor community" (the python.org webmasters) is really 
just a handful of people. I did suggest a few weeks ago (in response to 
an enquiry about why there was no guide to choosing between Python 2 and 
3 easily visible on the website) that we add or prominently link to a 
page with information like this. There was no response but I do think it 
would be a good idea.


> If there is such a disconnect we should think about remedying it: a
> large "Python 2 or 3?" button could link to a reasoned discussion of the
> pros and cons as evinced in this thread. That way people will end up
> with the right version more often (and be writing Python 2 that will
> more easily migrate to Python 3, if they cannot yet use 3).
>    

Yep.

All the best,

Michael Foord

> There seems to be a perception that the PSF can help fund developments,
> and indeed Jesse Noller has made a small start with his sprint funding
> proposal (which now has some funding behind it). I think if it is to do
> so the Foundation will have to look for substantial new funding. I do
> not currently understand where this funding would come from, and would
> like to tap your developer creativity in helping to define how the
> Foundation can effectively commit more developer time to Python.
>
> GSoC and GHOP are great examples, but there is plenty of room for all
> sorts of initiatives that result in development opportunities. I'd like
> to help.
>
> regards
>   Steve
>    


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (“BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From jnoller at gmail.com  Sun Jun 20 21:59:49 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sun, 20 Jun 2010 15:59:49 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <4C1E3B03.5060802@holdenweb.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100620153833.GD20639@thorne.id.au>
	<4C1E3B03.5060802@holdenweb.com>
Message-ID: <AANLkTilsAKDVg3QoTcCLw1oSmyGe6zc_xJQnonvVna22@mail.gmail.com>

On Sun, Jun 20, 2010 at 12:00 PM, Steve Holden <steve at holdenweb.com> wrote:
...snip
>>
> Given the amount of interest this thread has generated I can't help
> wondering why it isn't more prominent in python.org content. Is the
> developer community completely disjoint with the web content editor
> community?

Yes.

> If there is such a disconnect we should think about remedying it: a
> large "Python 2 or 3?" button could link to a reasoned discussion of the
> pros and cons as evinced in this thread. That way people will end up
> with the right version more often (and be writing Python 2 that will
> more easily migrate to Python 3, if they cannot yet use 3).

Yes; the website needs to change.

> There seems to be a perception that the PSF can help fund developments,
> and indeed Jesse Noller has made a small start with his sprint funding
> proposal (which now has some funding behind it). I think if it is to do
> so the Foundation will have to look for substantial new funding. I do
> not currently understand where this funding would come from, and would
> like to tap your developer creativity in helping to define how the
> Foundation can effectively commit more developer time to Python.

The good news is that I've already had a few potential companies
approach me to inquire as to the possibility of sponsoring porting
sprints for specific itches they have. I am going to continue to
encourage this on my end, as well as redirecting them to direct PSF
donations as they arise.

I suspect that if we were to keep pushing the concept of sponsored sprints
/ bounties on Python 3 library porting, we could see things pick up
donation-wise. I've long suspected that there are companies out there
who do have funds, but lack a target, and don't see a general PSF
donation as directly beneficial to their goals (although we will
continue to work to convince them otherwise).

> GSoC and GHOP are great examples, but there is plenty of room for all
> sorts of initiatives that result in development opportunities. I'd like
> to help.

Quick, off the seat of my pants idea - let's start by encouraging and
advertising sponsored sprints in the vein I've outlined in my existing
approved proposal. Once we know how to allow companies to donate
directly into a fund for direct improvement / porting, we provide them
a target which allows them to see measurable outcomes.

Just a thought

Jesse

From jnoller at gmail.com  Sun Jun 20 22:10:00 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sun, 20 Jun 2010 16:10:00 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100620184120.10EFB3A4099@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
Message-ID: <AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>

On Sun, Jun 20, 2010 at 2:40 PM, P.J. Eby <pje at telecommunity.com> wrote:
> At 10:57 AM 6/20/2010 -0700, Guido van Rossum wrote:
>>
>> The problem comes exactly where you find it: when *porting* existing
>> code that uses aforementioned ways to alleviate the pain, you find
>> that the hacks no longer work and a properly layered design is needed
>> that clearly distinguishes between which variables contain bytes and
>> which text.
>
> Actually, I would say that it's more that (in the network protocol case) we
> *have* bytes, some of which we would like to *treat* as text, yet do not
> wish to constantly convert back and forth to full-blown unicode --
> especially since the protocols themselves designate ASCII or latin-1 at the
> transport layer (sometimes with odder encodings above, but these already
> have to be explicitly dealt with by existing code).
>
> While reading over this thread, I'm wondering whether at least my
> (WSGI-related) problems in this area would be solved by the availability of
> a type (say "bstr") that was simply a wrapper providing string-like behavior
> over an underlying bytes, byte array, or memoryview, that would produce
> objects of compatible type when combined with strings (by encoding them to
> match).
>
> Then, I could wrap bytes with it to pass them to string operations, and then
> feed them back into everything else.  The bstr type ideally would be
> directly compatible with bytes I/O, or at least have a .bytes attribute that
> would be.
>
> It seems like that would reduce WSGI porting issues quite a bit, since it
> would mostly consist of throwing extra bstr() calls in where things are
> breaking, and maybe grabbing the .bytes attribute for I/O.
>
> This approach would still be explicit as to what types you're working with,
> but would not require O(n) *conversions* at every interaction boundary.  It
> would be limited, of course, to single-byte encodings with all characters
> (0-255) valid.
>
> OTOH, maybe there should just be a bytestrings module with bytestrings.ascii
> and bytestrings.latin1, and between the two that should cover the network
> protocol needs quite well.
>
> Actually, if the Python 3 str() constructor could do O(1) conversion for the
> latin-1 case (i.e., just wrapped the underlying bytes), I would just put,
> "bstr = lambda x: str(x,'latin-1')" at the top of my programs and have
> roughly the same effect.
>
> This idea is still a bit half-baked, but a more baked version might be just
> the ticket for porting stuff that used str to work with bytes in 2.x, if
> only because writing, e.g.:
>
>     newurl = bstr(urljoin(bstr(base), 'subdir'))
>
> seems so much saner than writing *this* everywhere:
>
>     newurl = str(urljoin(str(base, 'latin-1'), 'subdir'), 'latin-1')
>
> It is perhaps a bit late to propose this idea, since ideally we would also
> want to use it in 2.x to aid porting.  But I'm curious if any other people
> here experiencing byte/unicode woes in relation to network protocols would
> find this a solution to their chief frustration.  (i.e., that the stdlib
> often insists now on strings, where effectively bytes were usable before,
> and thus one must do conversions both coming and going.)
>

I hate to reply with a simple +1 - but I've heard this pain and
proposal from a frightening number of people. Something that allowed
you to use bytes with some of the string methods would go a really long
way to solving a lot of people's Python 3 pain. I don't relish the idea
that once people start moving over, there might be a billion
implementations of "things like this".

jesse

From tjreedy at udel.edu  Sun Jun 20 22:34:46 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 16:34:46 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
Message-ID: <hvlu18$npp$1@dough.gmane.org>

On 6/20/2010 6:35 AM, Laurens Van Houtven wrote:

> I'm one of the active people in #python that some people dislike for
> behavior with respect to Python 3.

As I wrote, I disliked the observable, written behavior, now changed. 
You are obviously a fine person. We both love Python and have both 
contributed time for years to helping others with Python.

The premise for this branch thread was:
IF #python is really #python2 and somewhat anti-Python3,
THEN (and only then), maybe we need a #python3.

I am delighted that you have already refuted the premise with a new, 
much improved, splash topic. I now feel free to ask Python3 questions on 
the existing channel -- things like "Is issue #### applicable to 
Python3?" -- as I work on reviewing tracker issues. In that respect, 
this thread is finished for me. But I hope it is just the start of 
better cooperation and communication.

Just a few notes in addition to other responses.

> First of all I'd like to defuse the situation.

Excellently done.

> Also, I'm pretty sure nobody has ever said that Python 3.x was a
> "failure", or anything like it.

I have no idea what has been said by you or anyone on #python, but 
people *have* posted on both python-list and here on py-dev things like 
"Python3 is not ready for use. It is a failure. Do not use it." (any of 
that sound familiar? ;-) and even "Python3 should be scrapped!". I am 
relieved that you have disassociated yourself and #python from such 
sentiments.

---
On newbies and version choice: I agree with Nick Efford that people 
using Python to learn about programming may be better off with Python3. 
I am using a subset of Python3 in a book on algorithms for the reasons 
he gave and others. Not even mentioned so far in this thread is the 
availability of unicode identifiers for people with non-Latin alphabets.

Of course, Asian schoolkids are unlikely to request help on #python. And 
the point about suggesting Python2 because that is what you all are good 
at helping with, is well taken. I do think people learning Python2 now 
should have a Python3-aware guide to doing so. This

 > In the mean while, we encourage people to write code that will be easy
 > to port and behave well in 3.x: new-style classes, don't use eager
 > versions when the Py3k default is lazy and you don't actually need the
 > eager thing, use as many third party libraries as possible (the idea
 > being that this would minimize effort needed to make the switch on the
 > grand scale of things), use absolute imports always (and only explicit
 > relative, but it's discouraged), always have a full unit test suite.

is a good start. I think something like that would be good for the 
#python web page, or added to python.org somewhere.
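
As a rough illustration of the quoted advice, the sketch below is written
so that it should behave the same on 2.6+ and 3.x.  The names `lazy_range`
and `SquareSource` are made up for the example, and it covers only two of
the points (new-style classes and preferring the lazy variant); absolute
imports and a unit test suite are left out for brevity.

```python
# On 2.x a module would also start with
#   from __future__ import absolute_import
# so that imports follow the 3.x rules.

try:
    lazy_range = xrange   # Python 2: xrange is the lazy variant
except NameError:
    lazy_range = range    # Python 3: range is already lazy


class SquareSource(object):   # explicit object base: new-style on 2.x
    """Hypothetical example class; not taken from any real library."""

    def __init__(self, limit):
        self.limit = limit

    def squares(self):
        # A generator yields values lazily on both 2.x and 3.x.
        for n in lazy_range(self.limit):
            yield n * n


first_four = list(SquareSource(4).squares())
```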

Terry Jan Reedy


From tjreedy at udel.edu  Sun Jun 20 22:57:20 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 16:57:20 -0400
Subject: [Python-Dev] Python Library Support in 3.x (Was: email package
	status in 3.X)
In-Reply-To: <AANLkTilsAKDVg3QoTcCLw1oSmyGe6zc_xJQnonvVna22@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100620153833.GD20639@thorne.id.au>	<4C1E3B03.5060802@holdenweb.com>
	<AANLkTilsAKDVg3QoTcCLw1oSmyGe6zc_xJQnonvVna22@mail.gmail.com>
Message-ID: <hvlvbh$14k$1@dough.gmane.org>

On 6/20/2010 3:59 PM, Jesse Noller wrote:

> I suspect that if we were to keep pushing the concept of sponsored sprints
> / bounties on Python 3 library porting, we could see things pick up
> donation-wise. I've long suspected that there are companies out there
> who do have funds, but lack a target, and don't see a general PSF
> donation as directly beneficial to their goals (although we will
> continue to work to convince them otherwise).

Universities **love** unrestricted donations to their general fund. But 
they bow to human nature and accept and even seek all kinds of targeted 
donations: buildings, rooms, departments, centers, institutes, programs, 
professorships, scholarships, research projects, curriculum developments 
projects, and so on. (Of course, the desire to target on the part of 
donors is also in part a recognition of human nature, that unrestricted 
funds might be used frivolously or even in a way that the donor 
considers obnoxious.)

So I think it is good that you/PSF try doing the same. And do not ignore 
individuals.

Terry Jan Reedy


From simon at ikanobori.jp  Sun Jun 20 23:12:17 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Sun, 20 Jun 2010 23:12:17 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvlu18$npp$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
Message-ID: <63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

In reply to the recent post by Laurens and the vow I made to change  
the text which is presented on the python-commandments domain I have  
asked Laurens to write a new text on the subject.

This message is a heads up to let all of you know that this new  
article is now available on the following URL: http://python-commandments.org/python3.html

This article will probably be the featured article on #python's /topic  
regarding Python 2 or 3.

I also read some remarks about possibly having an official article up  
on the Python website and in case that happens that will take the  
place of this article.

Regards,

Simon de Vlieger

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQIcBAEBAgAGBQJMHoQxAAoJEBBSHP7i+JXfGdUP/3NsUuMAJ2DONJZE4AbQIx5G
n7UE/SD0teZpyrYYIzV/PI1m40xz5XBe+zJyNfGN7m+MNoW7lGIxHgBoTB5CU6eE
10LeNy2qR9eqRQ/NZ+t8GJul4zuGIocPglDqCX/M6KtFCmtDsgSgbLaMFEgI4lRs
vZr9I9hUX9E1r+9T50uxo/YHQm+QW/HIYVks15nOoeUalkhxlQF67vvzH8/lds/F
sl5DxXe/zo287GeOIjpDNI/+0KJtUTLop4S/cpVxxA5eNX9lgGztq1wmKCMQmKcB
FS/WfQomyEhZhTk4CtIMQ7HM51bGUHwDeoO8qIOrayTM8ucoruO0QyzmZM0yxoDY
G+GVYabTKKp9ICDaUvOMxYpRnuz/Xb10nb9HphutQ03cjR28bJLR8nuLUBmIzcJK
ICXVIcV11hD01hzGWBJ7llQeoHl9ykaZu54PqpnZ/gdUrBVJ7VRItb5b4wP/PTwJ
frtNvnVwBnuR9wfQmCV9Do1UVTAVUqjFRpoBujIgSaZCa1wyF5U+8eHVD26u8lDj
+Hva28S/MggzIbc9x3/yv070204JaZVD1Q6fR5cSWdCMHgEDnwCmRjqlqLRW7zqS
al4/JaxDiqa7RrB8+liFDijtqopy7K6a3vDK4BBHuyqWmJ9lGqVJzC0ynRE6DV7N
4+lJCEF9qLW++QgjHXR2
=qRiB
-----END PGP SIGNATURE-----

From lvh at laurensvh.be  Sun Jun 20 23:16:52 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 23:16:52 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvlu18$npp$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
Message-ID: <AANLkTikghdK7rHM8tLl14FTGqRQppfqmaX4ujZnc88_X@mail.gmail.com>

Glad to hear the efforts are so appreciated. Unfortunately not
everyone agrees, but I'm beginning to think that's the tragedy of
internet politics :)

On Sun, Jun 20, 2010 at 10:34 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/20/2010 6:35 AM, Laurens Van Houtven wrote:
> I have no idea what has been said by you or anyone on #python, but people
> *have* posted on both python-list and here on py-dev things like "Python3 is
> not ready for use. It is a failure. Do not use it." (any of that sound
> familiar? ;-) and even "Python3 should be scrapped!". I am relieved that you
> have disassociated yourself and #python from such sentiments.

I can understand how people coming to #python might have thought that,
in retrospect. I just wanted to make that part clear :) As for the
"Python 3.x is a failure" people, I just tune those out, and if
they're trolling about it on IRC, ban them.

> On newbies and version choice: I agree with Nick Efford that people using
> Python to learn about programming may be better off with Python3. I am using
> a subset of Python3 in a book on algorithms for the reasons he gave and
> others. Not even mentioned so far in this thread is the availability of
> unicode identifiers for people with non-Latin alphabets.

I think the difference here is probably the focus. I think you're more
interested in teaching people Python in a more academic context:
basically teaching CS through Python. #python, on the other hand, is
trying to help people build practical tools where the CS is often an
afterthought (though not as much as it is in other programming
language channels which I won't name).

>> In the mean while, we encourage people to write code that will be easy
>> to port and behave well in 3.x: new-style classes, don't use eager
>> versions when the Py3k default is lazy and you don't actually need the
>> eager thing, use as many third party libraries as possible (the idea
>> being that this would minimize effort needed to make the switch on the
>> grand scale of things), use absolute imports always (and only explicit
>> relative, but it's discouraged), always have a full unit test suite.
>
> is a good start. I think something like that would be good for the #python
> web page, or added to python.org somewhere.

Yeah, it's actually extremely prevalent, it's just not voiced
anywhere, we could probably put it up somewhere. It's sort of up in
the pound-python page but it's well-hidden in tongue-in-cheek, as
Antoine pointed out :)

> Terry Jan Reedy
>

Laurens

From lvh at laurensvh.be  Sun Jun 20 23:18:34 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Sun, 20 Jun 2010 23:18:34 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
Message-ID: <AANLkTimRla7j7cAnZeGZnCbplm5AHA08ngfnerDscsac@mail.gmail.com>

That's not actually up just yet; I'd like people to review it.
Personally, I think it's still a tad biased towards Py3k. Until
then I'm keeping the Py3.x document by Nick Efford up there.

Thanks for your continued participation and seemingly endless patience,
Laurens

From amk at amk.ca  Sun Jun 20 23:22:09 2010
From: amk at amk.ca (A.M. Kuchling)
Date: Sun, 20 Jun 2010 17:22:09 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
Message-ID: <20100620212209.GA5319@andrew-kuchlings-macbook.local>

On Sun, Jun 20, 2010 at 10:57:05AM -0700, Guido van Rossum wrote:
> Education is needed. When you search Google (or Bing, for that matter
> :-) for "python unicode" the first hit is
> http://www.amk.ca/python/howto/unicode, which is highly detailed but
> probably too much information for the typical person faced with a
> UnicodeError exception traceback (that page is also focused on Python
> 2). What we need is a cookbook on how to deal with various common

Eep!  That should be directed to
http://docs.python.org/howto/unicode.html, the copy that's actually
incorporated in the Python docs.  I'll fix that immediately.

Regarding a smaller document for people who hit a UnicodeError
exception: could we write a little Unicode FAQ for python.org?

--amk


From tjreedy at udel.edu  Sun Jun 20 23:30:50 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 17:30:50 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
Message-ID: <hvm1ab$e7t$1@dough.gmane.org>

On 6/20/2010 8:26 AM, Giampaolo Rodolà wrote:

> I attempted to port pyftpdlib to python 3 several times and the
> biggest show stopper has always been the bytes / string difference
> introduced by Python 3 which forces you to *know* and *use* Unicode
> every time you deal with some text and 2to3 is completely useless
> here.

I believe the advice in the wiki porting page is to use unicode() and 
bytes() but never str(), in a version that runs in 2.6. Then 2to3 should 
do fine. For 2.5-, add 'bytes = str' somewhere.

2to3 still gets patches, I believe, when someone exhibits code that 
could and ought to be converted but is not.

I suspect that if you posted 'Problems porting pyftpdlib to Python3', 
you would get some help. If it involved inadequacies in the current 
tools and guides, it would be on-topic here. Or try python-list.

> The choice of forcing the user to use Unicode and "think in Unicode"
> was a very brave one, and I'm sure it's for the better, but not
> everyone wants to deal with that because Unicode is hard to swallow.

I felt that way until my daughter decided to switch from Spanish to 
Japanese for her foreign language. Once I quit fighting it, it became 
much easier to swallow and learn. As it turns out, thinking in Unicode 
is a pretty straightforward generalization of thinking in ascii. There 
are some annoying glitches due to the need to accommodate legacy systems. 
The plethora of legacy encodings for various subsets, besides ascii, is 
also a nuisance.

> The majority of people

who use latin-char alphabets

> prefer to stay with bytes and eventually learn
> and introduce Unicode only when that is actually needed.

The example at
http://code.google.com/p/pyftpdlib/
uses names and filenames. Without unicode, these are restricted to 
ascii, unless you use multiple encodings, which to me would be worse.

Terry Jan Reedy


From solipsis at pitrou.net  Sun Jun 20 23:47:23 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 20 Jun 2010 23:47:23 +0200
Subject: [Python-Dev] bytes / unicode
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
Message-ID: <20100620234723.600ad4a8@pitrou.net>

On Sun, 20 Jun 2010 14:40:56 -0400
"P.J. Eby" <pje at telecommunity.com> wrote:
> 
> Actually, I would say that it's more that (in the network protocol 
> case) we *have* bytes, some of which we would like to *treat* as 
> text, yet do not wish to constantly convert back and forth to 
> full-blown unicode

Well, then why don't you just stick with a bytes object?

> While reading over this thread, I'm wondering whether at least my 
> (WSGI-related) problems in this area would be solved by the 
> availability of a type (say "bstr") that was simply a wrapper 
> providing string-like behavior over an underlying bytes, byte array, 
> or memoryview, that would produce objects of compatible type when 
> combined with strings (by encoding them to match).

This really sounds horrible. Python 3 was designed precisely to
discourage ad hoc mixing of bytes and unicode.

> Actually, if the Python 3 str() constructor could do O(1) conversion 
> for the latin-1 case (i.e., just wrapped the underlying bytes), I 
> would just put, "bstr = lambda x: str(x,'latin-1')" at the top of my 
> programs and have roughly the same effect.

Did you do any measurements that show that latin-1 decoding (hardly a
complicated task) introduces a performance regression in Web frameworks
in 3.x?
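
One way to get such a measurement is sketched below. This is a minimal
sketch only: the payload is made up, and the figure it prints depends
on the machine and interpreter, so it says nothing by itself about any
particular Web framework.

```python
import timeit

# 1 KiB payload covering every byte value; latin-1 maps each byte to the
# code point with the same number, so decoding can never fail.
payload = bytes(range(256)) * 4

decoded = payload.decode('latin-1')
roundtrip = decoded.encode('latin-1')  # lossless: same bytes come back

# Time 10,000 decodes of the payload.
calls = 10000
seconds = timeit.timeit(lambda: payload.decode('latin-1'), number=calls)
print('latin-1 decode: %.3f us per call' % (seconds / calls * 1e6))
```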

> seems so much saner than writing *this* everywhere:
> 
>       newurl = str(urljoin(str(base, 'latin-1'), 'subdir'), 'latin-1')

urljoin already returns an str object. Why do you want to decode it
again?
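
A minimal sketch of that point, assuming a current Python 3 and a
made-up base URL: once the base has been decoded, the result of
urljoin is already text, so the outer str(..., 'latin-1') is not
needed.

```python
from urllib.parse import urljoin

base = b'http://example.com/app/'   # hypothetical bytes from the wire
joined = urljoin(str(base, 'latin-1'), 'subdir')

# joined is a str; no second decode step is required.
print(type(joined).__name__, joined)
```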


From ncoghlan at gmail.com  Sun Jun 20 23:53:38 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Jun 2010 07:53:38 +1000
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
Message-ID: <AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>

On Mon, Jun 21, 2010 at 7:12 AM, Simon de Vlieger <simon at ikanobori.jp> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> In reply to the recent post by Laurens and the vow I made to change the text
> which is presented on the python-commandments domain I have asked Laurens to
> write a new text on the subject.
>
> This message is a heads up to let all of you know that this new article is
> now available on the following URL:
> http://python-commandments.org/python3.html

That's a fairly decent write-up in my opinion. As Laurens pointed out, it
trends towards the "use Python 3 if you can, Python 2 if you need to"
point of view, which I personally think is the right spin to be
putting on this issue, but obviously opinions will vary on that front.

About the only specific wording tweak I would suggest is that "little
regard for backwards compatibility" should be phrased as "less regard
for backwards compatibility". There were still quite a few ideas we
rejected as gratuitously incompatible, even for Py3k (the eventual
decision to retain old-style string formatting comes to mind).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From benjamin at python.org  Sun Jun 20 23:55:03 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Sun, 20 Jun 2010 16:55:03 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100620234723.600ad4a8@pitrou.net>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
Message-ID: <AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>

2010/6/20 Antoine Pitrou <solipsis at pitrou.net>:
> On Sun, 20 Jun 2010 14:40:56 -0400
> "P.J. Eby" <pje at telecommunity.com> wrote:
>>
>> Actually, I would say that it's more that (in the network protocol
>> case) we *have* bytes, some of which we would like to *treat* as
>> text, yet do not wish to constantly convert back and forth to
>> full-blown unicode
>
> Well, then why don't you just stick with a bytes object?

There are not many tools for treating bytes as text.


-- 
Regards,
Benjamin

From tjreedy at udel.edu  Mon Jun 21 00:00:27 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 18:00:27 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<87fx0ikqw5.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100620113256.7ba8d86a@pitrou.net>
	<87fx0hfw7q.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <hvm31s$pr8$1@dough.gmane.org>

On 6/20/2010 1:30 PM, Stephen J. Turnbull wrote:
> The topic on #python seems unlikely to change at this point

I just verified that, thanks to Laurens and whoever, it has been.
It is now rather good.

Terry Jan Reedy


From lvh at laurensvh.be  Mon Jun 21 00:01:08 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 21 Jun 2010 00:01:08 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvm1ab$e7t$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<hvm1ab$e7t$1@dough.gmane.org>
Message-ID: <AANLkTin9dtnA72ATTSWwzBpoK5RegmNaTvMM5QVPtPlO@mail.gmail.com>

On Sun, Jun 20, 2010 at 11:30 PM, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/20/2010 8:26 AM, Giampaolo Rodolà wrote:
>
>> I attempted to port pyftpdlib to python 3 several times and the
>> biggest show stopper has always been the bytes / string difference
>> introduced by Python 3 which forces you to *know* and *use* Unicode
>> every time you deal with some text and 2to3 is completely useless
>> here.
>
> I believe the advice in the wiki porting page is to use unicode() and
> bytes() but never str(), in a version that runs in 2.6. Then 2to3 should do
> fine. For 2.5-, add 'bytes = str' somewhere.

Really? I thought you were supposed to call encode/decode methods on
the appropriate thing, depending if they're coming from a byte source
or a character source. The problems arise when you're doing things
like paths, which I believe are bytes on *nix and proper Unicode on
Windows (which basically just means they enforce an encoding, UTF-16
if I'm not mistaken). I don't actually use Windows so I might be
completely wrong here.
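
Roughly what I mean, as a minimal sketch for Python 3 (the sample data
is made up; os.fsencode/os.fsdecode exist in 3.2 and later):

```python
import os

raw = b'caf\xc3\xa9'                  # bytes from a byte source, e.g. a socket
text = raw.decode('utf-8')            # decode once, at the input boundary
wire = text.upper().encode('utf-8')   # encode once, at the output boundary

# For file names, os.fsencode/os.fsdecode apply the filesystem encoding.
name = os.fsdecode(os.fsencode('report.txt'))
print(text, wire, name)
```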

> 2to3 still gets patches, I believe, when someone exhibits code that could
> and ought to be converted but is not.
>
> I suspect that if you posted 'Problems porting pyftpdlib to Python3', you
> would get some help. If it involved inadequacies in the current tools and
> guides, it would be on-topic here. Or try python-list.
>
>> The choice of forcing the user to use Unicode and "think in Unicode"
>> was a very brave one, and I'm sure it's for the better, but not
>> everyone wants to deal with that because Unicode is hard to swallow.
>
> I felt that way until my daughter decided to switch from Spanish to Japanese
> for her foreign language. Once I quit fighting it, it became much easier
> to swallow and learn. As it turns out, thinking in Unicode is a pretty
> straightforward generalization of thinking in ascii. There are some annoying
> glitches due to the need to accommodate legacy systems. The plethora of
> legacy encodings for various subsets, besides ascii, is also a nuisance.

I think doing unicode/str properly in 2.x is very important, and #python
stresses it quite often. I think Py3k's strictness is a good idea
because people very often write something that appears to work for a
long time, and then someone tries it using funny bytes, and everything
blows apart. Convincing people their software is wrong when
"everything worked five minutes ago" is really hard :-)

You'd be surprised how long it can take before some of these problems
are found. A couple of weeks ago in #python we had exactly this
problem when we were helping Blender folks. There was a bug report
from a German Blender user; it turns out Blender ignores unicode in some
critical spot, making importing between people who disagree on charsets
impossible. And Blender isn't exactly a project that's two weeks old
and filled with idiots :) The downside is that *fixing* them then
becomes a nontrivial task.

The central problem is probably that a lot of people don't understand
Unicode. Recently I learned that even Tanenbaum got it wrong in his
latest revision of the computer networks book! (Although that might
just be my Dutch translation of it being bad).

> Terry Jan Reedy

Laurens

From ncoghlan at gmail.com  Mon Jun 21 00:08:47 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Jun 2010 08:08:47 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
Message-ID: <AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>

> I hate to reply with a simple +1 - but I've heard this pain and
> proposal from a frightening number of people, something which allowed
> you to use bytes with some of the string methods would go a really long
> way to solving a lot of people's Python 3 pain. I don't relish the idea
> that once people start moving over, there might be a billion
> implementations of "things like this".

My concern with it would be creating the temptation to use these new
objects that can't tolerate multibyte or variable character length
encodings when the general string type was more relevant (thus to some
degree perpetuating Python 2.x issues with incomplete Unicode
handling).

Perhaps if people could identify which specific string methods are
causing problems? In 3.2, there really aren't that many differences
between the available methods for strings and bytes:

>>> set(dir(str)) - set(dir(bytes))
{'isprintable', 'format', '__mod__', 'encode', 'isidentifier',
'_formatter_field_name_split', 'isnumeric', '__rmod__', 'isdecimal',
'_formatter_parser'}
>>> set(dir(bytes)) - set(dir(str))
{'decode', 'fromhex'}
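
The same comparison can be rerun on whatever interpreter is at hand. A
minimal sketch; the exact sets vary between Python versions, so no
particular output is assumed here:

```python
# Methods that exist on one type but not the other.
str_only = set(dir(str)) - set(dir(bytes))
bytes_only = set(dir(bytes)) - set(dir(str))

print(sorted(str_only))
print(sorted(bytes_only))
```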

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tjreedy at udel.edu  Mon Jun 21 00:21:20 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 18:21:20 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
Message-ID: <hvm491$2a8$1@dough.gmane.org>

On 6/20/2010 4:10 PM, Jesse Noller wrote:
> On Sun, Jun 20, 2010 at 2:40 PM, P.J. Eby<pje at telecommunity.com>  wrote:

>> While reading over this thread, I'm wondering whether at least my
>> (WSGI-related) problems in this area would be solved by the availability of
>> a type (say "bstr") that was simply a wrapper providing string-like behavior
>> over an underlying bytes, byte array, or memoryview, that would produce
>> objects of compatible type when combined with strings (by encoding them to
>> match).

> I hate to reply with a simple +1 - but I've heard this pain and
> proposal from a frightening number of people, something which allowed
> you to use bytes with some of the string methods would go a really long
> way to solving a lot of people's Python 3 pain. I don't relish the idea
> that once people start moving over, there might be a billion
> implementations of "things like this".

Given that the 3.x bytes and bytearray classes do retain text methods 
like .capitalize(), which are meaningless for arbitrary binary data, it 
is not clear to me what you are asking for or what problem a new class 
would solve. I am curious though.
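
For instance, a minimal sketch with a made-up header line, using only
methods that bytes and bytearray share with str:

```python
header = b'content-type: text/html'   # hypothetical protocol data

# partition(), title(), upper() and capitalize() all work on bytes.
name, _, value = header.partition(b': ')
titled = name.title()
shouted = value.upper()
capped = bytearray(b'hello').capitalize()

print(titled, shouted, capped)
```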

Terry Jan Reedy


From jnoller at gmail.com  Mon Jun 21 00:28:35 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sun, 20 Jun 2010 18:28:35 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvm491$2a8$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<hvm491$2a8$1@dough.gmane.org>
Message-ID: <ABBFEF88-77D7-438E-AB56-42FDD5CB287E@gmail.com>


On Jun 20, 2010, at 6:21 PM, Terry Reedy <tjreedy at udel.edu> wrote:

> On 6/20/2010 4:10 PM, Jesse Noller wrote:
>> On Sun, Jun 20, 2010 at 2:40 PM, P.J. Eby<pje at telecommunity.com>   
>> wrote:
>
>>> While reading over this thread, I'm wondering whether at least my
>>> (WSGI-related) problems in this area would be solved by the  
>>> availability of
>>> a type (say "bstr") that was simply a wrapper providing string- 
>>> like behavior
>>> over an underlying bytes, byte array, or memoryview, that would  
>>> produce
>>> objects of compatible type when combined with strings (by encoding  
>>> them to
>>> match).
>
>> I hate to reply with a simple +1 - but I've heard this pain and
>> proposal from a frightening number of people, something which allowed
>> you to use bytes with some of the string methods would go a really  
>> long
>> way to solving a lot of people's Python 3 pain. I don't relish the  
>> idea
>> that once people start moving over, there might be a billion
>> implementations of "things like this".
>
> Given that the 3.x bytes and bytearray classes do retain text  
> methods like .capitalize(), which are meaningless for arbitrary  
> binary data, it is not clear to me what you are asking for or what  
> problem a new class would solve. I am curious though.
>

Ask the web-sig and wsgi folks for starters. I know they've  
experienced non-zero pain.

From robertc at robertcollins.net  Mon Jun 21 00:41:37 2010
From: robertc at robertcollins.net (Robert Collins)
Date: Mon, 21 Jun 2010 10:41:37 +1200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
Message-ID: <AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>

Also, URLs are bytestrings - by definition; if the standard library
has made them unicode objects in 3, I expect a lot of pain in the
webserver space.

-Rob

From tjreedy at udel.edu  Mon Jun 21 00:57:33 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 18:57:33 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
Message-ID: <hvm6cu$gaq$1@dough.gmane.org>

On 6/20/2010 5:53 PM, Nick Coghlan wrote:
> On Mon, Jun 21, 2010 at 7:12 AM, Simon de Vlieger<simon at ikanobori.jp>  wrote:
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> In reply to the recent post by Laurens and the vow I made to change the text
>> which is presented on the python-commandments domain I have asked Laurens to
>> write a new text on the subject.

> That's a fairly decent write-up in my opinion. As Laurens pointed out, it
> trends towards the "use Python 3 if you can, Python 2 if you need to"
> point of view, which I personally think is the right spin to be
> putting on this issue, but obviously opinions will vary on that front.
>
> About the only specific wording tweak I would suggest is that "little
> regard for backwards compatibility" should be phrased as "less regard
> for backwards compatibility". There were still quite a few ideas we
> rejected as gratuitously incompatible, even for Py3k (the eventual
> decision to retain old-style string formatting comes to mind).

I have much the same opinion, and the same suggestion, as Nick. People do 
not usually see the proposals that were rejected and the changes not 
made in 3.0. For those who *do* wish, there are about 25 items listed at

http://www.python.org/dev/peps/pep-3099/
Things that will Not Change in Python 3000

Nick listed one thing not on the list. Eliminating the duplicate method 
names in the unittest module is another. (In isolation, most everyone 
was in favor. Guido's reason for leaving the duplication: porting 2 to 3 
is much easier with a good (and stable) test suite. Therefore, cleaning 
up unittest and possibly breaking test suites, even with a 2to3 
conversion, would not be a good idea.)

Terry Jan Reedy


From brett at python.org  Mon Jun 21 00:58:42 2010
From: brett at python.org (Brett Cannon)
Date: Sun, 20 Jun 2010 15:58:42 -0700
Subject: [Python-Dev] Mercurial
In-Reply-To: <20100619144218.4209e881@pitrou.net>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTimOp9g8kh2TQgcNGtryq_XZQ8Hm9F8MyxPvxQM9@mail.gmail.com> 
	<87tyozcl2m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100619135104.5b0f22ed@pitrou.net> 
	<20100619121302.GB12233@remy> <20100619144218.4209e881@pitrou.net>
Message-ID: <AANLkTilhNVg-55-WjkC6mk7wRohtmOvJgObN2jvdMYjS@mail.gmail.com>

On Sat, Jun 19, 2010 at 05:42, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Sat, 19 Jun 2010 17:43:02 +0530
> Senthil Kumaran <orsenthil at gmail.com> wrote:
>> On Sat, Jun 19, 2010 at 01:51:04PM +0200, Antoine Pitrou wrote:
>> > FWIW, the EOL extension is now part of Mercurial:
>> > http://mercurial.selenic.com/wiki/EolExtension
>>
>> Should we all move soon now?
>> Any target date you have in mind, Antoine?
>
> I should point out that I am in no way responsible for the migration.
> I think Dirkjan and Brett said they would tackle this after the 2.7
> release. But they'd better answer by themselves :)

With the eol extension dealt with, it means all hold-ups are on
python-dev's end, not Mercurial's, which is good.

As for what is left exactly, Dirkjan can answer better than I can; at
this point I am simply the guy trying to help keep the momentum going
while not doing any technical work.

From simon at ikanobori.jp  Mon Jun 21 01:05:39 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Mon, 21 Jun 2010 01:05:39 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
Message-ID: <E28D3F8A-5376-4898-BEE7-736FD507F850@ikanobori.jp>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 20 jun 2010, at 23:53, Nick Coghlan wrote:

> About the only specific wording tweak I would suggest is that "little
> regard for backwards compatibility" should be phrased as "less regard
> for backwards compatibility". There were still quite a few ideas we
> rejected as gratuitously incompatible, even for Py3k (the eventual
> decision to retain old-style string formatting comes to mind).

I have changed this text to include the wording tweak.

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQIcBAEBAgAGBQJMHp7DAAoJEBBSHP7i+JXfSkMQANw1SNroVYNkDUEJCIKtdKEJ
HyGBMZpG0liUfqVf8YAjNRYEscpWtsS2Inh8PBlTUwo5OTZPmbggJVZGO17E7Z8k
ld9TASppKraNZL62nBno5us2rnc2aUJL6GCaKPL1SQkk8GG1yLAV57j8d4R50QZS
4S7ogFPgVveM4VYEZXaZrlHpzlHjdh8xjq7f4Pl8IKJQZm6uOorK+sL+jiw0DauA
UEJ53rx0agy8GRwtnOY7XvqP0lgXLfZ/axTW9e6FkKXBcHYv3qdEAvdC3wyF9OKJ
nSNo7vIj4z24V7x9WQdIcc2wifHGPqSBSfnUc4Y3TPAaPLAjlX3HX3C4J+iFbI4/
c6VIm/OSPhcuclV0IgTJGvDOoyVlxTXFnOhOobXFI3KcAtCMQw5Y9gzx+4f5nahJ
YMlu54lFhqMsBzsTMlYcispEbbAuban4aZH7JAZ645F/AMzGqiTUZyHgD+A+i+9P
Ctu7DStT4tI/ZHcsqjnSkmpLxFhr3kNZct71aS22xOpm4MBAXmPEFYa2a/LpozHi
pDhuKJbwNc/+lbgiK267IP+V2pfKQ73qMQhn6hq0IPAgBXNu8fHJ6af6bygmIr/S
sK/0zddz3C2qCgqHmYGBwYfrmQB0fgM4ic9Zi2I9/flH+6cLolhSHkOqGkH1m0DQ
totdE00iTLVuy6VEmMmm
=NcT9
-----END PGP SIGNATURE-----

From lvh at laurensvh.be  Mon Jun 21 01:06:57 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 21 Jun 2010 01:06:57 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvm6cu$gaq$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
	<hvm6cu$gaq$1@dough.gmane.org>
Message-ID: <AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>

Okay cool, we fixed it: http://python-commandments.org/python3.html

People are otherwise happy with the text?

Thanks for your continued input,
Laurens

From tjreedy at udel.edu  Mon Jun 21 01:33:39 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 19:33:39 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
Message-ID: <hvm8gk$tjq$1@dough.gmane.org>

On 6/20/2010 5:55 PM, Benjamin Peterson wrote:
> 2010/6/20 Antoine Pitrou<solipsis at pitrou.net>:
>> On Sun, 20 Jun 2010 14:40:56 -0400
>> "P.J. Eby"<pje at telecommunity.com>  wrote:
>>>
>>> Actually, I would say that it's more that (in the network protocol
>>> case) we *have* bytes, some of which we would like to *treat* as
>>> text, yet do not wish to constantly convert back and forth to
>>> full-blown unicode
>>
>> Well, then why don't you just stick with a bytes object?
>
> There are not many tools for treating bytes as text.

If one writes a function (most easily in Python)
1. in terms of the methods and operations shared by unicode and bytes, 
which is nearly all of them, and
2. does not gratuitously (and dare I say, unpythonically) do a class 
check to unnecessarily exclude one or the other, and
3. does not specialize by assuming only one of the possible values for 
type-specific constants, such as number of chars/codes, and
4. does not do something unicode specific such as normalization,
then the function should be agnostic and operate generically.
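
A minimal sketch of a function that follows the four points above
(first_token is a made-up name, used for illustration only):

```python
def first_token(line):
    """Return the first whitespace-delimited token of a str or bytes line."""
    # split() with no separator is available on both types.
    parts = line.split(None, 1)
    # line[:0] is an empty value of the caller's own type, so neither ''
    # nor b'' is hard-coded here.
    return parts[0] if parts else line[:0]


print(first_token('GET /index.html'))
print(first_token(b'GET /index.html'))
```

There is no isinstance check and no type-specific literal, so the
result has the same type as the argument.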

I think there was some temptation to be 'pure' and limit text methods to 
str and enforce the decode-manipulate-encode paradigm (which is 
extremely common in various forms, and nothing unusual). But for 
practicality and efficiency, that was not done.

Do you have in mind any tools that could and should operate on both, but 
do not? (I realize that at the C level, code is not just specialized to 
'unicode', but to 2-byte versus 4-byte representations.)

Terry Jan Reedy


From steve at pearwood.info  Mon Jun 21 02:03:19 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Mon, 21 Jun 2010 10:03:19 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTin9dtnA72ATTSWwzBpoK5RegmNaTvMM5QVPtPlO@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<hvm1ab$e7t$1@dough.gmane.org>
	<AANLkTin9dtnA72ATTSWwzBpoK5RegmNaTvMM5QVPtPlO@mail.gmail.com>
Message-ID: <201006211003.19488.steve@pearwood.info>

On Mon, 21 Jun 2010 08:01:08 am Laurens Van Houtven wrote:

> I think doing unicode/str properly in 2.x is very important, and #python
> stresses it quite often. I think Py3k's strictness is a good idea
> because people very often write something that appears to work for a
> long time, and then someone tries it using funny bytes, and
> everything blows apart. Convincing people their software is wrong
> when "everything worked five minutes ago" is really hard :-)

Worse is when you have people who, when faced with their software 
failing to handle filenames containing non-ASCII characters ("those 
funny letters"), insist that the problem is the user for giving 
non-ASCII characters. Even when they're in the user's native 
(non-Latin) language. Even when the OS supports them.

Gah.


-- 
Steven D'Aprano

From pje at telecommunity.com  Mon Jun 21 03:33:55 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 20 Jun 2010 21:33:55 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <hvm8gk$tjq$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<hvm8gk$tjq$1@dough.gmane.org>
Message-ID: <20100621013405.19DC33A4099@sparrow.telecommunity.com>

At 07:33 PM 6/20/2010 -0400, Terry Reedy wrote:
>Do you have in mind any tools that could and should operate on both, 
>but do not?

 From http://mail.python.org/pipermail/web-sig/2009-September/004105.html :

"""The problem which arises is that unquoting of URLs in Python 3.X
stdlib can only be done on unicode strings. If though a string
contains non UTF-8 encoded characters it can fail."""

I don't have any direct experience with the specific issue 
demonstrated in that post, but in the context of the discussion as a 
whole, I understood the overall issue as "if you pass bytes to 
certain stdlib functions, you might get back unicode, an explicit 
error, or (at least in the case shown above) something that's just 
plain wrong."
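
A minimal sketch of the kind of result I mean, assuming a current
Python 3 (the behaviour of the release discussed in that post may have
differed) and a made-up one-character input:

```python
from urllib.parse import unquote, unquote_to_bytes

quoted = '%E9'   # an e-acute encoded as latin-1, then percent-quoted

as_default = unquote(quoted)                     # decoded as UTF-8
as_latin1 = unquote(quoted, encoding='latin-1')  # caller names the encoding
as_bytes = unquote_to_bytes(quoted)              # no decoding at all

print(ascii(as_default), ascii(as_latin1), as_bytes)
```

With the defaults, the undecodable byte is replaced with U+FFFD rather
than raising, which is the "plain wrong" case; the other two calls
keep the original byte value.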


From pje at telecommunity.com  Mon Jun 21 03:58:22 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 20 Jun 2010 21:58:22 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.c
 om>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
Message-ID: <20100621015824.6A84E3A4099@sparrow.telecommunity.com>

At 08:08 AM 6/21/2010 +1000, Nick Coghlan wrote:
>Perhaps if people could identify which specific string methods are
>causing problems?

__getitem__(int) returns an integer rather than a bytestring, so 
anything that manipulates individual characters can't be given bytes 
and have it work.

That was one of the key differences I had in mind for a bstr type, 
apart from  designing it to coerce normal strings to bstrs in 
cross-type operations, and to allow O(1) "conversion" to/from bytes.

Another randomly chosen byte/string incompatibility (Python 3.1; I 
don't have 3.2 handy at the moment):

 >>> os.path.join(b'x','y')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "c:\Python31\lib\ntpath.py", line 161, in join
     if b[:1] in seps:
TypeError: Type str doesn't support the buffer API

 >>> os.path.join('x',b'y')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "c:\Python31\lib\ntpath.py", line 161, in join
     if b[:1] in seps:
TypeError: 'in <string>' requires string as left operand, not bytes

Ironically, it seems to me that in trying to make the type 
distinction more rigid, Py3K fails in this area precisely because it 
is not a rigidly typed language in the Java or Haskell sense: i.e., 
os.path.join doesn't say, "I need two stringlike objects of the *same 
type*", not even in its docstring.

At least in Java, you would either implement a "path" type with 
coercions from bytes and strings, or you'd have a class with 
overloaded methods for handling join operations on bytes and strings, 
respectively, thereby avoiding this whole mess.

(Alas, this little example on the 'in' operator also shows that my 
bstr effort would probably fail anyway, because there's no 
'__rcontains__' (__lcontains__?) to allow it to override the str 
type's __contains__.)
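
The missing hook can be illustrated with a small sketch. The BStr class
below is hypothetical (a stand-in, not the bstr design discussed in this
thread). The 'in' operator only consults the right-hand operand's
__contains__, so a wrapper can handle "str in BStr" but gets no chance to
intercept "BStr in str":

```python
class BStr:
    """Hypothetical stand-in for a bytes-backed, string-like wrapper."""

    def __init__(self, data, encoding="latin-1"):
        self.data = bytes(data)
        self.encoding = encoding

    def __contains__(self, item):
        # Called for `item in bstr`: coerce str operands to bytes first.
        if isinstance(item, str):
            item = item.encode(self.encoding)
        elif isinstance(item, BStr):
            item = item.data
        return item in self.data


def contains_or_error(left, right):
    """Return the result of `left in right`, or the TypeError it raises."""
    try:
        return left in right
    except TypeError as exc:
        return exc


# BStr.__contains__ runs, so the wrapper can coerce the str operand.
wrapped_on_right = contains_or_error("x", BStr(b"xyz"))

# str.__contains__ runs and rejects the wrapper; BStr is never consulted.
wrapped_on_left = contains_or_error(BStr(b"x"), "xyz")
```

The second call is the case that ntpath's "b[:1] in seps" would hit if
seps were a plain str and b[:1] a wrapper object.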


From pje at telecommunity.com  Mon Jun 21 04:30:01 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 20 Jun 2010 22:30:01 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100620234723.600ad4a8@pitrou.net>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
Message-ID: <20100621023005.EE17E3A4099@sparrow.telecommunity.com>

At 11:47 PM 6/20/2010 +0200, Antoine Pitrou wrote:
>On Sun, 20 Jun 2010 14:40:56 -0400
>"P.J. Eby" <pje at telecommunity.com> wrote:
> >
> > Actually, I would say that it's more that (in the network protocol
> > case) we *have* bytes, some of which we would like to *treat* as
> > text, yet do not wish to constantly convert back and forth to
> > full-blown unicode
>
>Well, then why don't you just stick with a bytes object?

Because the stdlib is not consistent in how well it handles bytes objects.


> > While reading over this thread, I'm wondering whether at least my
> > (WSGI-related) problems in this area would be solved by the
> > availability of a type (say "bstr") that was simply a wrapper
> > providing string-like behavior over an underlying bytes, byte array,
> > or memoryview, that would produce objects of compatible type when
> > combined with strings (by encoding them to match).
>
>This really sounds horrible. Python 3 was designed precisely to
>discourage ad hoc mixing of bytes and unicode.

Who said ad hoc mixing?  The point is to have a simple way to ensure 
that my bytes don't get implicitly converted to unicode, and 
(ideally) don't have to get converted *back*, either.

The idea that by passing bytes to the stdlib, I randomly get back 
either bytes or unicode (i.e. undocumentedly and inconsistently 
between different stdlib APIs, as well as possibly dependent on 
runtime conditions), is NOT "discouraging ad hoc mixing".


> > seems so much saner than writing *this* everywhere:
> >
> >       newurl = str(urljoin(str(base, 'latin-1'), 'subdir'), 'latin-1')
>
>urljoin already returns an str object. Why do you want to decode it
>again?

Ugh.  I meant:

    newurl = urljoin(str(base, 'latin-1'), 'subdir').encode('latin-1')

Which just goes to the point of how ridiculous it is to have to 
convert things to strings and back again to use APIs that ought to 
just handle bytes properly in the first place.
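
As a rough sketch of what that round trip looks like when wrapped up
once, assuming a urljoin that is only given str: the helper name
urljoin_bytes and the example URL are made up for illustration, and
latin-1 is used only because it maps every byte to a code point and
back without loss.

```python
from urllib.parse import urljoin


def urljoin_bytes(base, url, encoding="latin-1"):
    """Join URLs held as bytes by decoding, joining as str, and re-encoding.

    Hypothetical helper: it hides the decode/encode round trip described
    above behind one call. It does not avoid the extra copies.
    """
    text_base = base.decode(encoding)
    text_url = url.decode(encoding) if isinstance(url, bytes) else url
    return urljoin(text_base, text_url).encode(encoding)


newurl = urljoin_bytes(b"http://example.com/app/", "subdir")
```

Every caller that holds bytes ends up needing a wrapper like this for
each str-only API it touches, which is the complaint being made here.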

(I don't know if there are actually any problems in the case of 
urljoin; I wasn't the person who originally brought up the "stdlib 
not treating URLs as bytestrings in 3.x" issue on the 
Web-SIG.  Somewhere along the line I got the impression that urljoin 
was one such API, but in researching the issue it looks like maybe 
the canonical example was parse_qsl.)

It's possible that the stdlib situation has improved tremendously 
since then, of course.  I don't know if the bug was reported, or how 
many remain.

And it's precisely the part where I don't know how many remain that 
keeps me from doing more than idly thinking about porting any of my 
libraries (let alone apps) to Python 3.x.  The fact that the stdlib 
itself has these sorts of issues raises major red flags to me about 
whether the One Obvious Way has yet been found.  If the stdlib 
maintainers don't agree on the One Obvious Way, that seems even 
worse.  Or if there is such a Way, but nobody has documented its 
practices yet, that's almost the same thing.

I also find it weird that there seem to be two camps on this subject, 
one of which claims that All Is Well And There Is No Problem -- but I 
do not recall seeing anyone who was in the "What do I do; this 
doesn't seem ready" camp who switched sides and took the time to 
write down what made them realize that they were wrong about there 
being a problem, and what steps they had to take.  The existence of 
one or more such documents would certainly ease my mind, and I 
imagine that of other people who are less waiting for others' 
libraries, than for the stdlib (and/or language) itself to settle.

(Or more precisely, for it to be SEEN to have settled.)


From tjreedy at udel.edu  Mon Jun 21 05:56:17 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 20 Jun 2010 23:56:17 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621013405.19DC33A4099@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<hvm8gk$tjq$1@dough.gmane.org>
	<20100621013405.19DC33A4099@sparrow.telecommunity.com>
Message-ID: <4C1EE2E1.5030105@udel.edu>

On 6/20/2010 9:33 PM, P.J. Eby wrote:
> At 07:33 PM 6/20/2010 -0400, Terry Reedy wrote:
>> Do you have in mind any tools that could and should operate on both,
>> but do not?
>
>  From http://mail.python.org/pipermail/web-sig/2009-September/004105.html :

Thanks for the concrete examples in this and your other post.
I am cc-ing the author of the above.

> """The problem which arises is that unquoting of URLs in Python 3.X
> stdlib can only be done on unicode strings.

Actually, I believe this is an encoding rather than bytes versus unicode 
issue.

> If though a string
> contains non UTF-8 encoded characters it can fail."""

Which is to say, I believe, if the ascii text in the (unicode) string 
has a % encoding of a byte that is not a legal utf-8 encoding of 
anything.

The specific example is

 >>> urllib.parse.parse_qsl('a=b%e0')
[('a', 'b?')]

where the character after 'b' is white ? in dark diamond, indicating an 
error.

parse_qsl() splits that input on '=' and sends each piece to 
urllib.parse.unquote. unquote() attempts to "Replace %xx escapes by 
their single-character equivalent." It has an encoding parameter that 
defaults to 'utf-8' in *its* call to .decode. parse_qsl does not have 
an encoding parameter. If it did, and it passed that to unquote, then 
the above example would become (simulated interaction)

 >>> urllib.parse.parse_qsl('a=b%e0', encoding='latin-1')
[('a', 'bà')]

I got that output by copying the file and adding "encoding='latin-1'" to 
the unquote call.

Does this solve this problem?
Has anything like this been added for 3.2?
Should it be?
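
For comparison, a short sketch against a later Python 3 release, where
urllib.parse.parse_qsl itself accepts encoding and errors arguments and
forwards them to the unquoting step. This assumes such a release; the
3.1 module discussed above did not have the parameter.

```python
from urllib.parse import parse_qsl, unquote

# With the default encoding ('utf-8') and errors ('replace'), the lone
# byte 0xE0 is not valid UTF-8 and becomes U+FFFD.
default_text = unquote("b%e0")

# With an explicit single-byte encoding, 0xE0 decodes to U+00E0.
latin1_text = unquote("b%e0", encoding="latin-1")

# parse_qsl passes the encoding through to the unquoting step.
pairs = parse_qsl("a=b%e0", encoding="latin-1")
```

Whether latin-1 is the right choice still depends on what the client
actually sent; the parameter only lets the caller say so.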

> I don't have any direct experience with the specific issue demonstrated
> in that post, but in the context of the discussion as a whole, I
> understood the overall issue as "if you pass bytes to certain stdlib
> functions, you might get back unicode, an explicit error, or (at least
> in the case shown above) something that's just plain wrong."

As indicated above, I so far think that the problem is with the 
application of the new model, not the model itself.

Just for 'fun', I tried feeding bytes to the function.
 >>> p.parse_qsl(b'a=b%e0')
Traceback (most recent call last):
   File "<pyshell#2>", line 1, in <module>
     p.parse_qsl(b'a=b%e0')
   File "C:\Programs\Python31\lib\urllib\parse.py", line 377, in parse_qsl
     pairs = [s2 for s1 in qs.split('&') for s2 in s1.split(';')]
TypeError: Type str doesn't support the buffer API

I do not know if that message is correct, but certainly trying to split 
bytes with unicode is (now, at least) a mistake. This could be 'fixed' 
by replacing the typed literals with expressions that match the type of 
the input. But I am not sure if that is sensible since the next step is 
to unquote and decode to unicode anyway. I just do not know the use case.
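
A minimal sketch of what "expressions that match the type of the input"
could look like for the splitting step alone. The function name
split_pairs is made up for illustration, and this is not the stdlib's
code:

```python
def split_pairs(qs):
    """Split a query string on '&' and ';', keeping the input's type.

    Illustrative only: the separators are chosen to match the type of
    `qs` instead of being hard-coded str literals.
    """
    if isinstance(qs, bytes):
        amp, semi = b"&", b";"
    else:
        amp, semi = "&", ";"
    return [s2 for s1 in qs.split(amp) for s2 in s1.split(semi)]


text_pairs = split_pairs("a=1&b=2;c=3")
byte_pairs = split_pairs(b"a=1&b=2;c=3")
```

This removes the TypeError, but it leaves open the question raised
above: the pieces still have to be unquoted and decoded afterwards.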

Terry Jan Reedy


From regebro at gmail.com  Mon Jun 21 06:37:18 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 21 Jun 2010 06:37:18 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com> 
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
Message-ID: <AANLkTimIIMd-Kz3LXn4wZnk_J9HPk1OPntLS1dJlujyJ@mail.gmail.com>

On Sun, Jun 20, 2010 at 23:55, Benjamin Peterson <benjamin at python.org> wrote:
> There are not many tools for treating bytes as text.

Well, what tools would you need that can also be used on bytes? Bytes
objects have many of the same methods as strings do, and that will
cover 99% of the cases. Most text tools assume that the text really is
text, and much of what they do doesn't make sense unless you've
converted it to Unicode first.

But most of the things you would need to do, such as in a web server,
don't really involve treating the text as something linguistic; it's
a matter of replacing and escaping and such, and that could be
done while the text is in bytes form. But the tools for that exist...

Is there some specific tool that is missing?
-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64

From stephen at xemacs.org  Mon Jun 21 13:19:50 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Mon, 21 Jun 2010 20:19:50 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
Message-ID: <87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>

Robert Collins writes:

 > Also, url's are bytestrings - by definition;

Eh?  RFC 3986 explicitly says

    A URI is an identifier consisting of a sequence of characters
    matching the syntax rule named <URI> in Section 3.

(where the phrase "sequence of characters" appears in all ancestors I
found back to RFC 1738), and

    2.  Characters

    The URI syntax provides a method of encoding data, presumably for
    the sake of identifying a resource, as a sequence of characters.
    The URI characters are, in turn, frequently encoded as octets for
    transport or presentation.  This specification does not mandate any
    particular character encoding for mapping between URI characters
    and the octets used to store or transmit those characters.  When a
    URI appears in a protocol element, the character encoding is
    defined by that protocol; without such a definition, a URI is
    assumed to be in the same character encoding as the surrounding
    text.

 > if the standard library has made them unicode objects in 3, I
 > expect a lot of pain in the webserver space.

Yup.  But pain is inevitable if people are treating URIs (whether URLs
or otherwise) as octet sequences.  Then your base URL is gonna be
b'mailto:stephen at xemacs.org', but the natural thing the UI will want
to do is 

    formurl = baseurl + '?subject=??????????'

IMO, the UI is right.  "Something" like the above "ought" to work.

So the function that actually handles composing the URL should take a
string (ie, unicode), and do all escaping.  The UI code should not
need to know about escaping.  If nothing escapes except the function
that puts the URL in composed form, and that function always escapes,
life is easy.
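
A minimal sketch of such a composing function, under stated
assumptions: the name compose_mailto, the example address, and the
choice of UTF-8 for percent-encoding are all illustrative (the RFC text
quoted above does not mandate an encoding). The caller passes plain
text and never sees the escaping.

```python
from urllib.parse import quote


def compose_mailto(address, subject):
    """Build a wire-ready mailto URL from unescaped text.

    Hypothetical helper: all escaping happens here, once, and the
    result is ASCII-only bytes ready for the wire.
    """
    url = "mailto:{}?subject={}".format(
        quote(address, safe="@"),   # keep '@' readable in the address
        quote(subject, safe=""),    # percent-encode everything reserved
    )
    return url.encode("ascii")


formurl = compose_mailto("stephen@example.org", "日本語 subject")
```

The UI code only handles str, and only the return value of this one
function is in composed, escaped form.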

Of course, in real life it's not that easy.  But it's possible to make
things unnecessarily hard for the users of your URI API(s), and one
way to do that is to make URIs into "just bytes" (and "just unicode"
is probably nearly as bad, except that at least you know it's not
ready for the wire).


From regebro at gmail.com  Mon Jun 21 14:09:33 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 21 Jun 2010 14:09:33 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com> 
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>

2010/6/21 Stephen J. Turnbull <stephen at xemacs.org>:
> IMO, the UI is right.  "Something" like the above "ought" to work.

Right. That said, many times when you want to do urlparse etc they
might be binary, and you might want binary. So maybe the methods
should work with both?

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64

From ncoghlan at gmail.com  Mon Jun 21 14:20:13 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Jun 2010 22:20:13 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621015824.6A84E3A4099@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
Message-ID: <AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>

On Mon, Jun 21, 2010 at 11:58 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 08:08 AM 6/21/2010 +1000, Nick Coghlan wrote:
>>
>> Perhaps if people could identify which specific string methods are
>> causing problems?
>
> __getitem__(int) returns an integer rather than a bytestring, so anything
> that manipulates individual characters can't be given bytes and have it
> work.

It can if you use length one slices rather than simple indexing.
Depending on the details, such algorithms may still fail for
multi-byte codecs though.
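
A small sketch of the difference, with a made-up helper name
(split_chars). Indexing bytes yields integers, while length-one slices
yield objects of the same type as the input, so the same code serves
both bytes and str:

```python
def split_chars(data):
    """Return the length-one pieces of `data`, each of the same type as `data`."""
    return [data[i:i + 1] for i in range(len(data))]


indexed = b"ab"[0]            # an int, not a bytes object
sliced = b"ab"[:1]            # a bytes object of length one

byte_pieces = split_chars(b"ab")
text_pieces = split_chars("ab")
```

As noted above, the pieces of a bytes object are single bytes, not
characters, so this is only safe where byte-at-a-time processing is
meaningful for the encoding in use.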

> That was one of the key differences I had in mind for a bstr type, apart
> from  designing it to coerce normal strings to bstrs in cross-type
> operations, and to allow O(1) "conversion" to/from bytes.

Erk, that just sounds like a recipe for recreating the problems 2.x
has in a new form.

> Another randomly chosen byte/string incompatibility (Python 3.1; I don't
> have 3.2 handy at the moment):
>
>>>> os.path.join(b'x','y')
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "c:\Python31\lib\ntpath.py", line 161, in join
>    if b[:1] in seps:
> TypeError: Type str doesn't support the buffer API
>
>>>> os.path.join('x',b'y')
> Traceback (most recent call last):
>  File "<stdin>", line 1, in <module>
>  File "c:\Python31\lib\ntpath.py", line 161, in join
>    if b[:1] in seps:
> TypeError: 'in <string>' requires string as left operand, not bytes
>
> Ironically, it seems to me that in trying to make the type distinction more
> rigid, Py3K fails in this area precisely because it is not a rigidly typed
> language in the Java or Haskell sense: i.e., os.path.join doesn't say, "I
> need two stringlike objects of the *same type*", not even in its docstring.

I believe it actually needs the objects to be compatible with the type
of os.sep, rather than just with each other (i.e. the type
restrictions on os.path.join are the same as those on os.sep.join,
even though the join algorithm itself is slightly different). This
restriction should be mentioned in the Py3k docstring and docs for
os.path.join - if it isn't, that would be a doc bug.

> At least in Java, you would either implement a "path" type with coercions
> from bytes and strings, or you'd have a class with overloaded methods for
> handling join operations on bytes and strings, respectively, thereby
> avoiding this whole mess.
>
> (Alas, this little example on the 'in' operator also shows that my bstr
> effort would probably fail anyway, because there's no '__rcontains__'
> (__lcontains__?) to allow it to override the str type's __contains__.)

OK, these examples convince me that the incompatibility problem is
real. However, I don't think a bstr type can solve them even without
the __rcontains__ problem - it would just recreate the pain that we
already have in the 2.x world.

Something that may make sense to ease the porting process is for some
of these "on the boundary" I/O related string manipulation functions
(such as os.path.join) to grow "encoding" keyword-only arguments. The
recommended approach would be to provide all strings, but bytes could
also be accepted if an encoding was specified. (If you want to mix
encodings - tough, do the decoding yourself).
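
A rough sketch of how such a keyword-only argument might behave,
written as a wrapper so that nothing in the stdlib is assumed to work
this way. The name join_path is hypothetical, and posixpath is used
only to keep the separator fixed for the example:

```python
import posixpath


def join_path(*parts, encoding=None):
    """Join path components, decoding bytes parts if an encoding is given.

    Hypothetical wrapper illustrating the proposal: str is the
    recommended input, and bytes are accepted only with an explicit
    encoding.
    """
    decoded = []
    for part in parts:
        if isinstance(part, bytes):
            if encoding is None:
                raise TypeError("bytes path component given without an encoding")
            part = part.decode(encoding)
        decoded.append(part)
    return posixpath.join(*decoded)


mixed = join_path(b"x", "y", encoding="utf-8")
```

Calling join_path(b"x", "y") without an encoding raises TypeError, so
mixed input remains an explicit decision by the caller.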

For the idea of avoiding excess copying of bytes through multiple
encoding/decoding calls... isn't that meant to be handled at an
architectural level (i.e. decode once on the way in, encode once on
the way out)? Optimising the single-byte codec case by minimising data
copying (possibly through creative use of PEP 3118) may be something
that we want to look at eventually, but it strikes me as something of
a premature optimisation at this point in time (i.e. the old adage
"first get it working, then get it working fast").

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Jun 21 14:33:08 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Jun 2010 22:33:08 +1000
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
	<hvm6cu$gaq$1@dough.gmane.org>
	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
Message-ID: <AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>

On Mon, Jun 21, 2010 at 9:06 AM, Laurens Van Houtven <lvh at laurensvh.be> wrote:
> Okay cool, we fixed it: http://python-commandments.org/python3.html
>
> People are otherwise happy with the text?

Yep, looks pretty good to me.

I hope you don't mind, but I actually borrowed your text to seed a
corresponding page on the Python wiki:
http://wiki.python.org/moin/Python2orPython3

It turns out the beginner's guide on the wiki doesn't even acknowledge
the possibility of downloading Python 3.1 rather than 2.6 to start
experimenting with Python.

The Wiki is probably a good place for this kind of material, anyway -
it makes it much easier for people to update as they identify major
third party libraries that do and don't have Py3k compatible versions
(and, some day, Python2 compatible versions).

Cheers,
Nick.

P.S. (We're going to have a tough decision to make somewhere along the
line where docs.python.org is concerned, too - when do we flick the
switch and make a 3.x version of the docs the default? We probably
won't need to seriously consider that question until the 3.3 time
frame though).

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Jun 21 14:51:03 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 21 Jun 2010 22:51:03 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621023005.EE17E3A4099@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
Message-ID: <AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>

On Mon, Jun 21, 2010 at 12:30 PM, P.J. Eby <pje at telecommunity.com> wrote:
> I also find it weird that there seem to be two camps on this subject, one of
> which claims that All Is Well And There Is No Problem -- but I do not recall
> seeing anyone who was in the "What do I do; this doesn't seem ready" camp
> who switched sides and took the time to write down what made them realize
> that they were wrong about there being a problem, and what steps they had to
> take.  The existence of one or more such documents would certainly ease my
> mind, and I imagine that of other people who are less waiting for others'
> libraries, than for the stdlib (and/or language) itself to settle.
>
> (Or more precisely, for it to be SEEN to have settled.)

I don't know that the "all is well" camp actually exists. The camp
that I do see existing is the one that says "without a bug report,
inconsistencies in the standard library's unicode handling won't get
fixed".

The issues picked up by the regression test suite have already been
dealt with, but that suite is unfortunately far from comprehensive.
Just like a lot of Python code that is out there, the standard library
isn't immune to the poor coding practices that were permitted by the
blurry lines between text and octet streams in 2.x.

It may be that there are places where we need to rewrite standard
library algorithms to be bytes/str neutral (e.g. by using length one
slices instead of indexing). It may be that there are more APIs that
need to grow "encoding" keyword arguments that they then pass on to
the functions they call or use to convert str arguments to bytes (or
vice-versa). But without people trying to port affected libraries and
reporting bugs when they find issues, the situation isn't going to
improve.

Now, if these bugs are already being reported against 3.1 and just
aren't getting fixed, that's a completely different story...

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ben+python at benfinney.id.au  Mon Jun 21 15:17:09 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Mon, 21 Jun 2010 23:17:09 +1000
Subject: [Python-Dev] [OT] carping about irritating people (was: bytes /
	unicode)
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <871vc0plt6.fsf_-_@benfinney.id.au>

"Stephen J. Turnbull" <stephen at xemacs.org> writes:

> your base URL is gonna be b'mailto:stephen at xemacs.org', but the
> natural thing the UI will want to do is
>
>     formurl = baseurl + '?subject=??????????'

Incidentally, which irritating person was the topic of this
Japanese-language message to you?

(The subject in Stephen's example message translates roughly as
"(unspecified third person) is an irritating rascal, don't you agree?".)

-- 
 \         "The userbase for strong cryptography declines by half with |
  `\      every additional keystroke or mouseclick required to make it |
_o__)                                             work." -Carl Ellison |
Ben Finney


From arcriley at gmail.com  Mon Jun 21 15:37:45 2010
From: arcriley at gmail.com (Arc Riley)
Date: Mon, 21 Jun 2010 09:37:45 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com> 
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com> 
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp> 
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com> 
	<hvm6cu$gaq$1@dough.gmane.org>
	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com> 
	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
Message-ID: <AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>

I would suggest that if packages that do not have Python 3 support yet are
listed, then their alternatives should be listed as well.

PyQt has had Py3 support for some time.
PostgreSQL and SQLite do (as does SQLAlchemy)
CherryPy has had Py3 support for the last release cycle
libxml2 does not, but lxml does.

Also, under where it mentions that most OS's do not include Python 3, it
should be noted which have good support for it.  Gentoo (for example) has
excellent support for Python 3, automatically installing Python packages
which have Py3 support for both Py2 and Py3, and the python-based Portage
package system runs cleanly on Py2.6, Py3.1 and Py3.2.

Give credit where credit is due. :-)


On Mon, Jun 21, 2010 at 8:33 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:

> On Mon, Jun 21, 2010 at 9:06 AM, Laurens Van Houtven <lvh at laurensvh.be>
> wrote:
> > Okay cool, we fixed it: http://python-commandments.org/python3.html
> >
> > People are otherwise happy with the text?
>
> Yep, looks pretty good to me.
>
> I hope you don't mind, but I actually borrowed your text to seed a
> corresponding page on the Python wiki:
> http://wiki.python.org/moin/Python2orPython3
>
> It turns out the beginner's guide on the wiki doesn't even acknowledge
> the possibility of downloading Python 3.1 rather than 2.6 to start
> experimenting with Python.
>
> The Wiki is probably a good place for this kind of material, anyway -
> it makes it much easier for people to update as they identify major
> third party libraries that do and don't have Py3k compatible versions
> (and, some day, Python2 compatible versions).
>
> Cheers,
> Nick.
>
> P.S. (We're going to have a tough decision to make somewhere along the
> line where docs.python.org is concerned, too - when do we flick the
> switch and make a 3.x version of the docs the default? We probably
> won't need to seriously consider that question until the 3.3. time
> frame though).
>
> --
> Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From barry at python.org  Mon Jun 21 15:57:30 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 09:57:30 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
	<hvm6cu$gaq$1@dough.gmane.org>
	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
Message-ID: <20100621095730.752157d0@heresy>

On Jun 21, 2010, at 09:37 AM, Arc Riley wrote:

>Also, under where it mentions that most OS's do not include Python 3, it
>should be noted which have good support for it.  Gentoo (for example) has
>excellent support for Python 3, automatically installing Python packages
>which have Py3 support for both Py2 and Py3, and the python-based Portage
>package system runs cleanly on Py2.6, Py3.1 and Py3.2.

We're trying to get there for Ubuntu (driven also by Debian).  We have Python
3.1.2 in main for Lucid, though we will probably not get 3.2 into Maverick
(the October 2010 release).  We're currently concentrating on Python 2.7 as a
supported version because it'll be released by then, while 3.2 will still be
in beta.

If you want to help, or have complaints, kudos, suggestions, etc. for Python
support on Ubuntu, you can contact me off-list.

-Barry

From ncoghlan at gmail.com  Mon Jun 21 16:21:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Jun 2010 00:21:31 +1000
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
	<hvm6cu$gaq$1@dough.gmane.org>
	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
Message-ID: <AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>

On Mon, Jun 21, 2010 at 11:37 PM, Arc Riley <arcriley at gmail.com> wrote:
> I would suggest that if packages that do not have Python 3 support yet are
> listed, then their alternatives should also.
>
> PyQt has had Py3 support for some time.
> PostgreSQL and SQLite do (as does SQLAlchemy)
> CherryPy has had Py3 support for the last release cycle
> libxml2 does not, but lxml does.
>
> Also, under where it mentions that most OS's do not include Python 3, it
> should be noted which have good support for it.  Gentoo (for example) has
> excellent support for Python 3, automatically installing Python packages
> which have Py3 support for both Py2 and Py3, and the python-based Portage
> package system runs cleanly on Py2.6, Py3.1 and Py3.2.
>
> Give credit where credit is due. :-)

A decent listing of major packages that already support Python 3 would
be very handy for the new Python2orPython3 page I created on the wiki,
and easier to keep up-to-date. (the old Early2to3Migrations page
didn't look particularly up to date, but hopefully we can keep the new
list in a happier state).

It just ticked past midnight for me, so I'm off to bed, but for anyone
with a wiki account, have at it:
http://wiki.python.org/moin/Python2orPython3

(Updating the beginner's guide to recognise Python 3 as a valid option
would also be helpful: http://wiki.python.org/moin/BeginnersGuide)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Jun 21 16:25:58 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Jun 2010 00:25:58 +1000
Subject: [Python-Dev] [OT] carping about irritating people (was: bytes /
	unicode)
In-Reply-To: <871vc0plt6.fsf_-_@benfinney.id.au>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<871vc0plt6.fsf_-_@benfinney.id.au>
Message-ID: <AANLkTikb14olArrvHtDJiTUBdAljq7t6NRkDbN446qr2@mail.gmail.com>

On Mon, Jun 21, 2010 at 11:17 PM, Ben Finney <ben+python at benfinney.id.au> wrote:
> "Stephen J. Turnbull" <stephen at xemacs.org> writes:
>
>> your base URL is gonna be b'mailto:stephen at xemacs.org', but the
>> natural thing the UI will want to do is
>>
>>     formurl = baseurl + '?subject=??????????'
>
> Incidentally, which irritating person was the topic of this
> Japanese-language message to you?
>
> (The subject in Stephen's example message translates roughly as
> "(unspecified third person) is an irritating rascal, don't you agree?".)

Given what he said about the base URL, it would appear to be a
self-deprecating self-description. Nicely done :)

(I can pronounce that subject line, but I didn't know what it meant
without the translation).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Mon Jun 21 16:27:59 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Jun 2010 00:27:59 +1000
Subject: [Python-Dev] [OT] carping about irritating people (was: bytes /
	unicode)
In-Reply-To: <AANLkTikb14olArrvHtDJiTUBdAljq7t6NRkDbN446qr2@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<871vc0plt6.fsf_-_@benfinney.id.au>
	<AANLkTikb14olArrvHtDJiTUBdAljq7t6NRkDbN446qr2@mail.gmail.com>
Message-ID: <AANLkTinDH2PJHEZvHaGQA3gGbHEfm6c9DTdW69T2nW2x@mail.gmail.com>

> Given what he said about the base URL, it would appear to be a
> self-deprecating self-description. Nicely done :)

Gah, no it isn't, you're right, the message leaves it unspecified. OK,
no more posting after midnight for me... (well, not tonight, anyway)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Mon Jun 21 16:51:25 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 10:51:25 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
Message-ID: <20100621145133.7F5333A404D@sparrow.telecommunity.com>

At 10:20 PM 6/21/2010 +1000, Nick Coghlan wrote:
>For the idea of avoiding excess copying of bytes through multiple
>encoding/decoding calls... isn't that meant to be handled at an
>architectural level (i.e. decode once on the way in, encode once on
>the way out)? Optimising the single-byte codec case by minimising data
>copying (possibly through creative use of PEP 3118) may be something
>that we want to look at eventually, but it strikes me as something of
>a premature optimisation at this point in time (i.e. the old adage
>"first get it working, then get it working fast").

The issue is, I'd like to have an idempotent incantation that I can 
use to make the inputs and outputs to stdlib functions behave in a 
type-safe manner with respect to bytes, in cases where bytes are 
really what I want operated on.

Note too that this is an argument for symmetry in wrapping the inputs 
and outputs, so that the code doesn't have to "know" what it's dealing with!

After all, right now, if a stdlib function might return bytes or 
unicode depending on runtime conditions, I can't even hardcode an 
.encode() call -- it would fail if the return type is a bytes.

This basically goes against the "tell, don't ask" pattern, and the 
Pythonically idempotent approach.  That is, Python builtins normally 
return you back the same thing if it's already what you want - 
int(someInt)-> someInt, iter(someIter)->someIter, etc.
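
A minimal sketch of the behaviour described above, assuming CPython 3.
The helper name encode_result is hypothetical and only stands in for
any caller that hardcodes an .encode() call on a return value:

```python
# The built-ins hand back an object that is already of the requested type.
n = 42
assert int(n) is n            # int(someInt) -> someInt

it = iter([1, 2, 3])
assert iter(it) is it         # iter(someIter) -> someIter


def encode_result(value, encoding="utf-8"):
    """Hypothetical caller that hardcodes .encode() on a return value."""
    return value.encode(encoding)


encoded = encode_result("abc")    # fine when the value is a str

try:
    encode_result(b"abc")         # bytes objects have no .encode() method
    failed = False
except AttributeError:
    failed = True
```

The second call fails because the conversion is not idempotent: it works
only when the caller already knows it holds a str.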

Since this incantation may need to be used often, and in places that 
are not known to me in advance, I would like it to not impose new 
overhead in unexpected places.  (i.e., the usual argument brought 
against making changes to the 'list' type that would change certain 
operations from O(1) to O(log something)).

It's more about predictability, and having One *Obvious* Way To Do 
It, as opposed to "several ways, which you need to think carefully 
about and restructure your entire architecture around if 
necessary".  One obvious way means I can focus on the mechanical 
effort of porting *first*, without having to think.

So, the performance issue isn't really about performance *per se*, so 
much as about the "mental UI" of the language.  You could just as 
easily lie and tell me that your bstr implementation is O(1), and I 
would probably be happy and never notice, because the issue was never 
really about performance as such, but about having to *think* about 
it.  (i.e., breaking flow.)

Really, the entire issue can presumably be dealt with by some series 
of incantations - it's just code after all.  But having to sit and 
think about *every* situation where I'm dealing with bytes/unicode 
distinctions seems like a torture compared to being able to say, 
"okay, so when dealing with this sort of API and this sort of data, 
this is the One Obvious Way to do the conversions."

It's One Obvious Way that I want, but some people seem to be arguing 
that the One Obvious Way is to Think Carefully About It Every Time -- 
and that seems to violate the "Obvious" part, IMO.  ;-)


From a.badger at gmail.com  Mon Jun 21 17:28:05 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 11:28:05 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <20100621095730.752157d0@heresy>
References: <hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
	<hvm6cu$gaq$1@dough.gmane.org>
	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
	<20100621095730.752157d0@heresy>
Message-ID: <20100621152805.GU5787@unaka.lan>

On Mon, Jun 21, 2010 at 09:57:30AM -0400, Barry Warsaw wrote:
> On Jun 21, 2010, at 09:37 AM, Arc Riley wrote:
> 
> >Also, under where it mentions that most OS's do not include Python 3, it
> >should be noted which have good support for it.  Gentoo (for example) has
> >excellent support for Python 3, automatically installing Python packages
> >which have Py3 support for both Py2 and Py3, and the python-based Portage
> >package system runs cleanly on Py2.6, Py3.1 and Py3.2.
> 
> We're trying to get there for Ubuntu (driven also by Debian).  We have Python
> 3.1.2 in main for Lucid, though we will probably not get 3.2 into Maverick
> (the October 2010 release).  We're currently concentrating on Python 2.7 as a
> supported version because it'll be released by then, while 3.2 will still be
> in beta.
> 
> If you want to help, or have complaints, kudos, suggestions, etc. for Python
> support on Ubuntu, you can contact me off-list.
> 
<nod> Fedora 14 is about the same.  A nice to have thing that goes along
with these would be a table that has packages ported to python3 and which
distributions have the python3 version of the package.

Once most of the important third party packages are ported to python3 and in
the distributions, this table will likely become out-dated and probably
should be reaped but right now it's a very useful thing to see.

-Toshio

From arcriley at gmail.com  Mon Jun 21 17:31:08 2010
From: arcriley at gmail.com (Arc Riley)
Date: Mon, 21 Jun 2010 11:31:08 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <201006211113.06767.stephan.richter@gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com> 
	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com> 
	<201006211113.06767.stephan.richter@gmail.com>
Message-ID: <AANLkTiklu4xr_f3A8JL2mmTIdAmc9RBynFBWkuptavAd@mail.gmail.com>

Personally, I'd like to celebrate the upcoming Python 3.2 release (which
will hopefully include 3to2) with moving all packages which do not have the
'Programming Language :: Python :: 3' classifier to a "Legacy" section of
PyPI and offer only Python 3 packages otherwise.  Of course put a banner at
the top clearly explaining that Python 2 packages can be found in the Legacy
section.

Radical, I know, but at some point we really need to make this move.

PyPI really needs a mechanism to cull out the moribund packages from being
displayed next to the actively maintained ones.  There's so many packages on
there that only work on Python 2.2-2.4 (for example), or with a specific
highly outdated version of another package, etc.
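
A rough sketch of the split proposed above, under stated assumptions:
the package records and the function names below are invented for
illustration and do not reflect how PyPI stores or queries its data.
Only the classifier string itself comes from the proposal.

```python
PY3_CLASSIFIER = "Programming Language :: Python :: 3"


def supports_python3(classifiers):
    """True if the Trove classifier list declares Python 3 support."""
    return PY3_CLASSIFIER in classifiers


def split_listing(packages):
    """Partition {name: classifiers} into (current, legacy) name lists."""
    current, legacy = [], []
    for name, classifiers in sorted(packages.items()):
        (current if supports_python3(classifiers) else legacy).append(name)
    return current, legacy


# Hypothetical sample data, not real PyPI metadata.
sample = {
    "newpkg": ["Programming Language :: Python :: 3"],
    "oldpkg": ["Programming Language :: Python :: 2"],
    "untagged": [],
}
current, legacy = split_listing(sample)
```

Note that a package with no classifiers at all lands in the legacy list
here, which is one reason the outcome depends on maintainers tagging
their packages.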


On Mon, Jun 21, 2010 at 11:13 AM, Stephan Richter <stephan.richter at gmail.com
> wrote:

> On Monday, June 21, 2010, Nick Coghlan wrote:
> > A decent listing of major packages that already support Python 3 would
> > be very handy for the new Python2orPython3 page I created on the wiki,
> > and easier to keep up-to-date. (the old Early2to3Migrations page
> > didn't look particularly up to date, but hopefully we can keep the new
> > list in a happier state).
>
> I really just want to be able to go to PyPI, Click on "Browse packages" and
> then select "Python 3" (it can currently be accomplished by clicking
> "Python"
> and then  "3"). Of course, package developers need to be encouraged to add
> these Trove classifiers so that the listings are as complete as possible.
>
> Regards,
> Stephan
> --
> Entrepreneur and Software Geek
> Google me. "Zope Stephan Richter"
>

From lvh at laurensvh.be  Mon Jun 21 17:33:35 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 21 Jun 2010 17:33:35 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>
	<hvlu18$npp$1@dough.gmane.org>
	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>
	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>
	<hvm6cu$gaq$1@dough.gmane.org>
	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
Message-ID: <AANLkTil0avMKuvJ7a1gGA7z8Euuzdz23RmdGS-KNj47u@mail.gmail.com>

On Mon, Jun 21, 2010 at 3:37 PM, Arc Riley <arcriley at gmail.com> wrote:
> I would suggest that if packages that do not have Python 3 support yet are
> listed, then their alternatives should also.

Okay, this is being worked on.

> PyQt has had Py3 support for some time.

Added, as well as PySide.

> PostgreSQL and SQLite do (as does SQLAlchemy)

wrt Postgres: Is that psycopg2? Not sure what that's an alternative
to, since the 2.x list doesn't have any ORMs or database APIs at the
moment (unless Django counts).

> CherryPy has had Py3 support for the last release cycle

Okay, going to add it but can't right now because lots of people are editing.

> libxml2 does not, but lxml does.

That's okay, I don't think many people seriously use python-libxml2
anyway (using lxml instead) :-) Again, not sure what that would be an
alternative for though?

> Also, under where it mentions that most OS's do not include Python 3, it
> should be noted which have good support for it.  Gentoo (for example) has
> excellent support for Python 3, automatically installing Python packages
> which have Py3 support for both Py2 and Py3, and the python-based Portage
> package system runs cleanly on Py2.6, Py3.1 and Py3.2.

As Barry has pointed out 3.x is in many distros now, so in order to
not make people angry that their distro who also does the Right Thing
isn't mentioned (what's Arch do? py3k is easily available from AUR,
that's not really ArchLinux proper but every Arch user I've ever
talked to considers AUR an integral part), I added this:
"""
Also, quite a few distributions have Python 3.x available already for
end-users, even if it's not the default interpreter.
"""
I think that would make everyone happy, and the wiki article that much
more maintainable.


Thanks for your input,
Laurens

From lvh at laurensvh.be  Mon Jun 21 17:39:04 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 21 Jun 2010 17:39:04 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTiklu4xr_f3A8JL2mmTIdAmc9RBynFBWkuptavAd@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>
	<201006211113.06767.stephan.richter@gmail.com>
	<AANLkTiklu4xr_f3A8JL2mmTIdAmc9RBynFBWkuptavAd@mail.gmail.com>
Message-ID: <AANLkTilGK08cIOB2uDe4Go92CTEO0j1eQ4TpwVltC61t@mail.gmail.com>

On Mon, Jun 21, 2010 at 5:28 PM, Toshio Kuratomi <a.badger at gmail.com> wrote:
> <nod> Fedora 14 is about the same.  A nice to have thing that goes along
> with these would be a table that has packages ported to python3 and which
> distributions have the python3 version of the package.

Yeah, this is exactly why I'd prefer to not have to maintain a
specific list. Big distros are making Python 3.x available, it's not
the default interpreter yet anywhere (AFAIK?), but that's going to
happen in the next few releases of said distributions.

On Mon, Jun 21, 2010 at 5:31 PM, Arc Riley <arcriley at gmail.com> wrote:
> Personally, I'd like to celebrate the upcoming Python 3.2 release (which
> will hopefully include 3to2) with moving all packages which do not have the
> 'Programming Language :: Python :: 3' classifier to a "Legacy" section of
> PyPI and offer only Python 3 packages otherwise.  Of course put a banner at
> the top clearly explaining that Python 2 packages can be found in the Legacy
> section.
>
> Radical, I know, but at some point we really need to make this move.

I agree we have to make it at some point but I feel this is way, way too early.

thanks for your continued input,
Laurens

From barry at python.org  Mon Jun 21 17:43:07 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 11:43:07 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
Message-ID: <20100621114307.48735698@heresy>

On Jun 21, 2010, at 10:20 PM, Nick Coghlan wrote:

>Something that may make sense to ease the porting process is for some
>of these "on the boundary" I/O related string manipulation functions
>(such as os.path.join) to grow "encoding" keyword-only arguments. The
>recommended approach would be to provide all strings, but bytes could
>also be accepted if an encoding was specified. (If you want to mix
>encodings - tough, do the decoding yourself).

This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
for it.

Would it make sense to have "encoding-carrying" bytes and str types?
Basically, I'm thinking of types (maybe even the current ones) that carry
around a .encoding attribute so that they can be automatically encoded and
decoded where necessary.  This at least would simplify APIs that need to do
the conversion.

By default, the .encoding attribute would be some marker to indicate "I have
no idea, do it explicitly" and if you combine ebytes or estrs that have
incompatible encodings, you'd either throw an exception or reset the .encoding
to IAmConfuzzled.  But say you had an email header like:

=?euc-jp?b?pc+l7aG8pe+hvKXrpcmhqg==?=

And code like the following (made less crappy):

-----snip snip-----
class ebytes(bytes):
    encoding = 'ascii'

    def __str__(self):
        s = estr(self.decode(self.encoding))
        s.encoding = self.encoding
        return s


class estr(str):
    encoding = 'ascii'


s = str(b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa', 'euc-jp')
b = bytes(s, 'euc-jp')

eb = ebytes(b)
eb.encoding = 'euc-jp'
es = str(eb)
print(repr(eb), es, es.encoding)
-----snip snip-----

Running this you get:

b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa' ハローワールド！ euc-jp

Would it be feasible?  Dunno.  Would it help ease the bytes/str confusion?
Dunno.  But I think it would help make APIs easier to design and use because
it would cut down on the encoding-keyword function signature infection.
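
The "combine with incompatible encodings" rule could be sketched as
follows. This is only one possible reading: the EncodingMismatch
exception, the constructor signature, and the choice to treat None as
the "I have no idea" marker are all assumptions made for illustration.

```python
class EncodingMismatch(ValueError):
    """Hypothetical error for combining values with different encodings."""


class ebytes(bytes):
    """bytes that carry an optional encoding; None means 'unknown'."""

    def __new__(cls, data, encoding=None):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        other_encoding = getattr(other, "encoding", None)
        if self.encoding is None or other_encoding is None:
            # One side is unknown, so the result is unknown as well.
            result_encoding = None
        elif self.encoding != other_encoding:
            raise EncodingMismatch(
                "cannot combine %s with %s" % (self.encoding, other_encoding))
        else:
            result_encoding = self.encoding
        return ebytes(bytes.__add__(self, other), result_encoding)


a = ebytes(b"\xa5\xcf", "euc-jp")
b = ebytes(b"\xa5\xed", "euc-jp")
joined = a + b                     # same encoding: allowed, stays euc-jp
unknown = a + b"raw"               # plain bytes: encoding becomes None

try:
    a + ebytes(b"abc", "ascii")    # different encodings: rejected
    mismatch = False
except EncodingMismatch:
    mismatch = True
```

Raising is the stricter of the two options mentioned; resetting the
result to the unknown marker would replace the raise with
result_encoding = None.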

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20100621/10fd5d0f/attachment.pgp>

From murman at gmail.com  Mon Jun 21 18:03:30 2010
From: murman at gmail.com (Michael Urman)
Date: Mon, 21 Jun 2010 11:03:30 -0500
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621145133.7F5333A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
Message-ID: <AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>

On Mon, Jun 21, 2010 at 09:51, P.J. Eby <pje at telecommunity.com> wrote:
> The issue is, I'd like to have an idempotent incantation that I can use to
> make the inputs and outputs to stdlib functions behave in a type-safe manner
> with respect to bytes, in cases where bytes are really what I want operated
> on.
>
> Note too that this is an argument for symmetry in wrapping the inputs and
> outputs, so that the code doesn't have to "know" what it's dealing with!

It is somewhat troublesome that there doesn't appear to be an obvious
built-in idempotent-when-possible function that gives back the
provided bytes/str, or converts to the requested type per the listed
encoding (as of 3.1.2). Would it be useful to make the second versions
of these work, or would that return us to the confusion of the 2.x
era? On the other hand, since these are all TypeErrors instead of
UnicodeErrors, it's an easy wrapper to write.

    >>> bytes('abc', 'latin-1')
    b'abc'
    >>> bytes(b'abc', 'latin-1')
    TypeError: encoding or errors without a string argument

    >>> str(b'abc', 'latin-1')
    'abc'
    >>> str('abc', 'latin-1')
    TypeError: decoding str is not supported

Interestingly the online docs for str say it can decode either a byte
string or a character buffer, a term which doesn't yield a definition
in a search; apparently either a string is not a character buffer, or
the docs are incorrect.
http://docs.python.org/py3k/library/functions.html?highlight=str#str

However it looks like this is consistent with int.
    >>> int(4, 0)
    TypeError: int() can't convert non-string with explicit base
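
For reference, the "easy wrapper" might look something like the sketch
below. The names as_bytes and as_str are made up here; nothing with
these names exists in the stdlib.

```python
def as_bytes(value, encoding="latin-1", errors="strict"):
    """Return bytes unchanged; encode a str with the given encoding."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding, errors)
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


def as_str(value, encoding="latin-1", errors="strict"):
    """Return str unchanged; decode bytes with the given encoding."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)
```

Both helpers are idempotent: applying as_bytes to its own result, or
as_str to its own result, gives the same value back.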

-- 
Michael Urman

From stephen at xemacs.org  Mon Jun 21 18:08:53 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 01:08:53 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
Message-ID: <87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>

Lennart Regebro writes:

 > 2010/6/21 Stephen J. Turnbull <stephen at xemacs.org>:
 > > IMO, the UI is right.  "Something" like the above "ought" to work.
 > 
 > Right. That said, many times when you want to do urlparse etc they
 > might be binary, and you might want binary. So maybe the methods
 > should work with both?

First, a caveat: I'm a Unicode/encodings person, not an experienced
web programmer.  My opinions on whether this would work well in
practice should be taken with a grain of salt.

Speaking for myself, I live in a country where the natives have
saddled themselves with no less than 4 encodings in common use, and I
would never want "binary" since none of them would display as anything
useful in a traceback.  Wherever possible, I decode "blobs" into
structured objects, I do it as soon as possible, and if for efficiency
reasons I want to do this lazily, I store the blob in a separate
.raw_object attribute.  If they're textual, I decode them to text.  I
can't see an efficiency argument for decoding URIs lazily in most
applications.

In the case of structured text like URIs, I would create a separate
class for handling them with string-like operations.  Internally, all
text would be raw Unicode (ie, not url-encoded); repr(uri) would use
some kind of readable quoting convention (not url-encoding) to
disambiguate random reserved characters from separators, while
str(uri) would produce an url-encoded string.  Converting to and from
wire format is just .encode and .decode, then, and in this country you
need to be flexible about which encoding you use.
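
A very rough sketch of such a class, limited to the path component.
URIPath and its method names are hypothetical, and the repr below is
only a stand-in for the "readable quoting convention"; the calls to
urllib.parse.quote and unquote are the real stdlib functions.

```python
from urllib.parse import quote, unquote


class URIPath:
    """Hypothetical wrapper: a URI path stored internally as raw Unicode."""

    def __init__(self, text, encoding="utf-8"):
        self.text = text            # not url-encoded
        self.encoding = encoding    # charset used when percent-encoding

    @classmethod
    def decode(cls, wire, encoding="utf-8"):
        """Build from url-encoded ASCII bytes as received on the wire."""
        return cls(unquote(wire.decode("ascii"), encoding=encoding), encoding)

    def encode(self):
        """Return the url-encoded wire form as ASCII bytes."""
        return str(self).encode("ascii")

    def __str__(self):
        return quote(self.text, safe="/", encoding=self.encoding)

    def __repr__(self):
        # Shows the readable text rather than the percent-encoded form.
        return "URIPath(%r)" % self.text


path = URIPath("/docs/\u65e5\u672c\u8a9e", "euc-jp")
wire = path.encode()
round_trip = URIPath.decode(wire, "euc-jp")
```

The encoding is an explicit argument in both directions, which is the
flexibility asked for when several charsets are in common use.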

Agreed, this stuff is really annoying.  But I think that just comes
with the territory.  PJE reports that folks don't like doing encoding
and decoding all over the place.  I understand that, but if they're
doing a lot of that, I have to wonder why.  Why not define the one
line function and get on with life?

The thing is, where I live, it's not going to be a one line function.
I'm going to be dealing with URLs that are url-encoded representations
of UTF-8, Shift-JIS, EUC-JP, and occasionally RFC 2047!  So I need an
API that explicitly encodes and decodes.  And I need an API that
presents Japanese as Japanese rather than as line noise.

Eg, PJE writes

    Ugh.  I meant: 

    newurl = urljoin(str(base, 'latin-1'), 'subdir').encode('latin-1')

    Which just goes to the point of how ridiculous it is to have to  
    convert things to strings and back again to use APIs that ought to  
    just handle bytes properly in the first place. 

But if you need that "everywhere", what's so hard about

from urllib.parse import urljoin

def urljoin_wrapper(base, subdir):
    return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')

Now, note how that pattern fails as soon as you want to use
non-ISO-8859-1 languages for subdir names.  In Python 3, the code
above is just plain buggy, IMHO.  The original author probably will
never need the generalization.  But her name will be cursed unto the
nth generation by people who use her code on a different continent.
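
To make the failure concrete, here is a small demonstration. The URL
and the directory names are invented for the example; the wrapper is
repeated so the snippet runs on its own.

```python
from urllib.parse import urljoin


def urljoin_wrapper(base, subdir):
    return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')


# A subdir name that Latin-1 can represent works as hoped.
ok = urljoin_wrapper(b'http://example.com/a/', 'caf\xe9')

# A Japanese subdir name cannot be encoded as Latin-1.
try:
    urljoin_wrapper(b'http://example.com/a/', '\u65e5\u672c\u8a9e')
    raised = False
except UnicodeEncodeError:
    raised = True
```

The join itself succeeds in both cases; it is the hardcoded
.encode('latin-1') on the way out that breaks.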

The net result is that bytes are *not* a programmer- or user-friendly
way to do this, except for the minority of the world for whom Latin-1
is a good approximation to their daily-use unibyte encoding (eg, it's
probably usable for debugging in Dansk, but you won't win any
popularity contests in Tel Aviv or Shanghai).


From tjreedy at udel.edu  Mon Jun 21 18:23:18 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 12:23:18 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>	<hvm6cu$gaq$1@dough.gmane.org>	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>
	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
Message-ID: <hvo3ln$55n$1@dough.gmane.org>

On 6/21/2010 8:33 AM, Nick Coghlan wrote:

> P.S. (We're going to have a tough decision to make somewhere along the
> line where docs.python.org is concerned, too - when do we flick the
> switch and make a 3.x version of the docs the default?

Easy. When 3.2 is released. When 2.7 is released, 3.2 becomes 'trunk'. 
Trunk releases always take over docs.python.org. To do otherwise would 
be to say that 3.2 is not a real trunk release and not yet ready for 
real use -- a major slam.

Actually, I thought this was already discussed and decided ;-).

Terry Jan Reedy


From a.badger at gmail.com  Mon Jun 21 18:34:04 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 12:34:04 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621114307.48735698@heresy>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
Message-ID: <20100621163404.GV5787@unaka.lan>

On Mon, Jun 21, 2010 at 11:43:07AM -0400, Barry Warsaw wrote:
> On Jun 21, 2010, at 10:20 PM, Nick Coghlan wrote:
> 
> >Something that may make sense to ease the porting process is for some
> >of these "on the boundary" I/O related string manipulation functions
> >(such as os.path.join) to grow "encoding" keyword-only arguments. The
> >recommended approach would be to provide all strings, but bytes could
> >also be accepted if an encoding was specified. (If you want to mix
> >encodings - tough, do the decoding yourself).
> 
> This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
> for it.
> 
> Would it make sense to have "encoding-carrying" bytes and str types?
> Basically, I'm thinking of types (maybe even the current ones) that carry
> around a .encoding attribute so that they can be automatically encoded and
> decoded where necessary.  This at least would simplify APIs that need to do
> the conversion.
> 
> By default, the .encoding attribute would be some marker to indicate "I have
> no idea, do it explicitly" and if you combine ebytes or estrs that have
> incompatible encodings, you'd either throw an exception or reset the .encoding
> to IAmConfuzzled.  But say you had an email header like:
> 
> =?euc-jp?b?pc+l7aG8pe+hvKXrpcmhqg==?=
> 
> And code like the following (made less crappy):
> 
> -----snip snip-----
> class ebytes(bytes):
>     encoding = 'ascii'
> 
>     def __str__(self):
>         s = estr(self.decode(self.encoding))
>         s.encoding = self.encoding
>         return s
> 
> 
> class estr(str):
>     encoding = 'ascii'
> 
> 
> s = str(b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa', 'euc-jp')
> b = bytes(s, 'euc-jp')
> 
> eb = ebytes(b)
> eb.encoding = 'euc-jp'
> es = str(eb)
> print(repr(eb), es, es.encoding)
> -----snip snip-----
> 
> Running this you get:
> 
> b'\xa5\xcf\xa5\xed\xa1\xbc\xa5\xef\xa1\xbc\xa5\xeb\xa5\xc9\xa1\xaa' ハローワールド！ euc-jp
> 
> Would it be feasible?  Dunno.  Would it help ease the bytes/str confusion?
> Dunno.  But I think it would help make APIs easier to design and use because
> it would cut down on the encoding-keyword function signature infection.
> 
I like the idea of having encoding information carried with the data.
I don't think that an ebytes type that can *optionally* have an encoding
attribute makes the situation less confusing, though.  To me the biggest
problem with python-2.x's unicode/bytes handling was not that it threw
exceptions but that it didn't always throw exceptions.  You might test this
in python2::
    t = u'cafe'
    function(t)

And say, ah my code works.  Then a user gives it this::
    t = u'café'
    function(t)

And get a unicode error because the function only works with unicode in the
ascii range.
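
A minimal sketch of that pitfall, assuming a hypothetical function() that
quietly requires ASCII-only text. It is written for Python 3, where the
failure shows up as an explicit encode error rather than an implicit
coercion, so it illustrates the shape of the problem rather than the exact
python2 behaviour:

```python
def function(t):
    # Hypothetical stand-in for code that only handles ASCII text.
    return t.encode('ascii')


ascii_result = function('cafe')      # the case the tests exercise

try:
    function('caf\u00e9')            # the case a user eventually supplies
except UnicodeEncodeError:
    non_ascii_failed = True
else:
    non_ascii_failed = False
```

The same call site passes or fails depending only on the data, which is why
a test suite with ASCII-only fixtures can miss it.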

ebytes seems to have the same pitfall where the code path exercised by your
tests could work with::
    eb = ebytes(b)
    eb.encoding = 'euc-jp'
    function(eb)

but the user exercises a code path that does this and fails::
    eb = ebytes(b)
    function(eb)

What do you think of making the encoding attribute a mandatory part of
creating an ebytes object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
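
A rough sketch of what a mandatory encoding could look like. This is only an
illustration under assumptions: the ebytes class here is hypothetical, and
validating the encoding name with codecs.lookup() is a choice made for the
example, not something proposed in the thread:

```python
import codecs


class ebytes(bytes):
    """Hypothetical bytes subclass that cannot exist without an encoding."""

    def __new__(cls, data, encoding):
        # Reject unknown encoding names up front (raises LookupError).
        codecs.lookup(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __str__(self):
        # Decode using the encoding the object was created with.
        return self.decode(self.encoding)


eb = ebytes(b'\xa5\xcf\xa5\xed\xa1\xbc', 'euc-jp')
text = str(eb)

try:
    ebytes(b'\xa5\xcf')              # no encoding given
except TypeError:
    missing_encoding_rejected = True
else:
    missing_encoding_rejected = False
```

With this shape, the untested code path fails at construction time instead
of at some later point where the bytes are combined with text.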

-Toshio

From tjreedy at udel.edu  Mon Jun 21 18:35:08 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 12:35:08 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTiklu4xr_f3A8JL2mmTIdAmc9RBynFBWkuptavAd@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>
	<201006211113.06767.stephan.richter@gmail.com>
	<AANLkTiklu4xr_f3A8JL2mmTIdAmc9RBynFBWkuptavAd@mail.gmail.com>
Message-ID: <hvo4bt$833$1@dough.gmane.org>

On 6/21/2010 11:31 AM, Arc Riley wrote:
> Personally, I'd like to celebrate the upcoming Python 3.2 release (which
> will hopefully include 3to2) with moving all packages which do not have
> the 'Programming Language :: Python :: 3' classifier to a "Legacy"
> section of PyPI and offer only Python 3 packages otherwise.  Of course
> put a banner at the top clearly explaining that Python 2 packages can be
> found in the Legacy section.

I do not think 2.x should be dissed any more than 3.x, which is to say, 
not at all. The impression I got from lurking on #python last night, in 
between disconnects, is that at least a couple of people feel that there 
is a move afoot to push people to Python3. Whether that had any 
connection to discussions here, I could not tell.

Having pypi.python.org/py2 and pypi.python.org/py3 though might be a 
good idea. Inquiries from either url would automatically filter. The 
counterargument is that there may be people looking for packages 
available for *both*.

> Radical, I know, but at some point we really need to make this move.
>
> PyPI really needs a mechanism to cull out the moribund packages from
> being displayed next to the actively maintained ones.

The default ordering for search results is by rating.

> There's so many
> packages on there that only work on Python 2.2-2.4 (for example), or
> with a specific highly outdated version of another package, etc.

And there are people running those versions. I think better 
classification and filtering is the answer, though hard to mandate.

Terry Jan Reedy


From pje at telecommunity.com  Mon Jun 21 18:46:44 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 12:46:44 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.c
 om>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
Message-ID: <20100621164650.16A093A414B@sparrow.telecommunity.com>

At 10:51 PM 6/21/2010 +1000, Nick Coghlan wrote:
>It may be that there are places where we need to rewrite standard
>library algorithms to be bytes/str neutral (e.g. by using length one
>slices instead of indexing). It may be that there are more APIs that
>need to grow "encoding" keyword arguments that they then pass on to
>the functions they call or use to convert str arguments to bytes (or
>vice-versa). But without people trying to port affected libraries and
>reporting bugs when they find issues, the situation isn't going to
>improve.
>
>Now, if these bugs are already being reported against 3.1 and just
>aren't getting fixed, that's a completely different story...
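
As an illustration of the "length one slices instead of indexing" point: in
Python 3, indexing bytes yields an int while indexing str yields a str, but
slicing preserves the type of either. The helper name below is made up for
the example:

```python
def starts_with_slash(path):
    # A length-one slice keeps the type of its input, so the same
    # comparison works for both bytes and str.
    sep = b'/' if isinstance(path, bytes) else '/'
    return path[:1] == sep


index_bytes = b'/tmp'[0]     # an int in Python 3
index_str = '/tmp'[0]        # a one-character str
slice_bytes = b'/tmp'[:1]    # bytes of length one
slice_str = '/tmp'[:1]       # str of length one
```

Slicing also behaves sensibly on empty input, where indexing would raise
IndexError.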

The overall impression, though, is that this isn't really a step 
forward.  Now, bytes are the special case instead of unicode, but 
that special case isn't actually handled any better by the stdlib - 
in fact, it's arguably worse.  And, the burden of addressing this 
seems to have been shifted from the people who made the change, to 
the people who are going to use it.  But those people are not 
necessarily in a position to tell you anything more than, "give me 
something that works with bytes".

What I can tell you is that before, since string constants in the 
stdlib were ascii bytes, and transparently promoted to unicode, 
stdlib behavior was *predictable* in the presence of special cases: 
you got back either bytes or unicode, but either way, you could 
idempotently upgrade the result to unicode, or just pass it on.  APIs 
were "str safe, unicode aware".  If you passed in bytes, you weren't 
going to get unicode without a warning, and if you passed in unicode, 
it'd work and you'd get unicode back.

Now, the APIs are neither safe nor aware -- if you pass bytes in, you 
get unpredictable results back.

Ironically, it almost *would* have been better if bytes simply didn't 
work as strings at all, *ever*, but if you could wrap them with a 
bstr() to *treat* them as text.  You could still have restrictions on 
combining them, as long as it was a restriction on the unicode you 
mixed with them.  That is, if you could combine a bstr and a str if 
the *str* was restricted to ASCII.

If we had the Python 3 design discussions to do over again, I think I 
would now have stuck with the position of not letting bytes be 
string-compatible at all, and instead proposed an explicit bstr() 
wrapper/adapter to use them as strings, that would (in that case) 
force coercion in the direction of bytes rather than strings.  (And 
bstr need not have been a builtin - it could have been something you 
import, to help discourage casual usage.)
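
A very rough sketch of the bstr() idea, limited to concatenation. Everything
here is an assumption made for illustration: the class is hypothetical, it
subclasses bytes purely for brevity, and it does not address the real
proposal of removing string behaviour from plain bytes:

```python
class bstr(bytes):
    """Hypothetical wrapper: bytes that may be combined with ASCII-only str."""

    @staticmethod
    def _as_bytes(other):
        if isinstance(other, str):
            # Coerce toward bytes; non-ASCII text raises UnicodeEncodeError.
            return other.encode('ascii')
        if isinstance(other, (bytes, bytearray)):
            return bytes(other)
        return NotImplemented

    def __add__(self, other):
        other = self._as_bytes(other)
        if other is NotImplemented:
            return NotImplemented
        return bstr(bytes(self) + other)

    def __radd__(self, other):
        other = self._as_bytes(other)
        if other is NotImplemented:
            return NotImplemented
        return bstr(other + bytes(self))


joined = bstr(b'/base') + '/subdir'
prefixed = 'http:' + bstr(b'//host')
```

The result of mixing is always bstr, never str, so the coercion direction is
toward bytes as described above.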

Might this approach lead to some people doing things wrong in the 
case of porting?  Sure.  But there'd be little reason to use it in 
new code that didn't have a real need for bytestring manipulation.

It might've been a better balance between practicality and purity, in 
that it keeps the language pure, while offering a practical way to 
deal with things in bytes if you really need to.  And, bytes wouldn't 
silently succeed *some* of the time, leading to a trap.  An easy 
inconsistency is worse than a bit of uniform chicken-waving.

Is it too late to make that tradeoff?  Probably.  Certainly it's not 
practical to *implement* outside the language core, and removing 
string methods would fux0r anybody whose currently-ported code relies 
on bytes objects having string-like methods.


From fuzzyman at voidspace.org.uk  Mon Jun 21 18:49:55 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 21 Jun 2010 17:49:55 +0100
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621164650.16A093A414B@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
	<20100621164650.16A093A414B@sparrow.telecommunity.com>
Message-ID: <4C1F9833.2080905@voidspace.org.uk>

On 21/06/2010 17:46, P.J. Eby wrote:
> At 10:51 PM 6/21/2010 +1000, Nick Coghlan wrote:
>> It may be that there are places where we need to rewrite standard
>> library algorithms to be bytes/str neutral (e.g. by using length one
>> slices instead of indexing). It may be that there are more APIs that
>> need to grow "encoding" keyword arguments that they then pass on to
>> the functions they call or use to convert str arguments to bytes (or
>> vice-versa). But without people trying to port affected libraries and
>> reporting bugs when they find issues, the situation isn't going to
>> improve.
>>
>> Now, if these bugs are already being reported against 3.1 and just
>> aren't getting fixed, that's a completely different story...
>
> The overall impression, though, is that this isn't really a step 
> forward. Now, bytes are the special case instead of unicode, but that 
> special case isn't actually handled any better by the stdlib - in 
> fact, it's arguably worse. And, the burden of addressing this seems to 
> have been shifted from the people who made the change, to the people 
> who are going to use it. But those people are not necessarily in a 
> position to tell you anything more than, "give me something that works 
> with bytes".
>
> What I can tell you is that before, since string constants in the 
> stdlib were ascii bytes, and transparently promoted to unicode, stdlib 
> behavior was *predictable* in the presence of special cases: you got 
> back either bytes or unicode, but either way, you could idempotently 
> upgrade the result to unicode, or just pass it on. APIs were "str 
> safe, unicode aware". If you passed in bytes, you weren't going to get 
> unicode without a warning, and if you passed in unicode, it'd work and 
> you'd get unicode back.
>
> Now, the APIs are neither safe nor aware -- if you pass bytes in, you 
> get unpredictable results back.
>
> Ironically, it almost *would* have been better if bytes simply didn't 
> work as strings at all, *ever*, but if you could wrap them with a 
> bstr() to *treat* them as text. You could still have restrictions on 
> combining them, as long as it was a restriction on the unicode you 
> mixed with them. That is, if you could combine a bstr and a str if the 
> *str* was restricted to ASCII.
>
> If we had the Python 3 design discussions to do over again, I think I 
> would now have stuck with the position of not letting bytes be 
> string-compatible at all, and instead proposed an explicit bstr() 
> wrapper/adapter to use them as strings, that would (in that case) 
> force coercion in the direction of bytes rather than strings. (And 
> bstr need not have been a builtin - it could have been something you 
> import, to help discourage casual usage.)
>
> Might this approach lead to some people doing things wrong in the case 
> of porting? Sure. But there'd be little reason to use it in new code 
> that didn't have a real need for bytestring manipulation.
>
> It might've been a better balance between practicality and purity, in 
> that it keeps the language pure, while offering a practical way to 
> deal with things in bytes if you really need to. And, bytes wouldn't 
> silently succeed *some* of the time, leading to a trap. An easy 
> inconsistency is worse than a bit of uniform chicken-waving.
>
> Is it too late to make that tradeoff? Probably. Certainly it's not 
> practical to *implement* outside the language core, and removing 
> string methods would fux0r anybody whose currently-ported code relies 
> on bytes objects having string-like methods.
>

Why is your proposed bstr wrapper not practical to implement outside the 
core and use in your own libraries and frameworks?

Michael



-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies (“BOGUS AGREEMENTS”) that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From pje at telecommunity.com  Mon Jun 21 18:54:53 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 12:54:53 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100621165506.26D4C3A404D@sparrow.telecommunity.com>

At 01:08 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
>But if you need that "everywhere", what's so hard about
>
>def urljoin_wrapper (base, subdir):
>     return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')
>
>Now, note how that pattern fails as soon as you want to use
>non-ISO-8859-1 languages for subdir names.

Bear in mind that the use cases I'm talking about here are WSGI 
stacks with components written by multiple authors -- each of whom 
may have to define that function, and still get it right.
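
For reference, a small sketch of the quoted wrapper and of where it breaks.
The example.com URL and the sample subdir values are made up for
illustration:

```python
from urllib.parse import urljoin


def urljoin_wrapper(base, subdir):
    # Round-trip the bytes through latin-1 so urljoin() sees str.
    return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')


ok = urljoin_wrapper(b'http://example.com/a/', 'caf\u00e9')

try:
    # Katakana cannot be encoded as latin-1, so the final encode fails.
    urljoin_wrapper(b'http://example.com/a/', '\u30cf\u30ed\u30fc')
except UnicodeEncodeError:
    non_latin1_failed = True
else:
    non_latin1_failed = False
```

Each component author who writes this helper has to pick the same encoding
and handle the same failure, which is the coordination problem described
above.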

Sure, there are some things that could go in wsgiref in the 
stdlib.  However, as of this moment, there's only a very uneasy rough 
consensus in Web-Sig as to how the heck WSGI should actually *work* 
on Python 3, because of issues like these.

That makes it tough to actually say what should happen in the stdlib 
-- e.g., which things should be classed as stdlib bugs, which things 
should be worked around with wrappers or new functions, etc.


From benjamin at python.org  Mon Jun 21 19:14:09 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 21 Jun 2010 12:14:09 -0500
Subject: [Python-Dev] [RELEASED] Python 2.7 release candidate 2
Message-ID: <AANLkTimV7AaEn76BdRS3cAV1SyQ6gKmuLlptJRmMaFK7@mail.gmail.com>

On behalf of the Python development team, I'm tickled pink to announce the
second release candidate of Python 2.7.

Python 2.7 is scheduled (by Guido and Python-dev) to be the last major version
in the 2.x series. However, 2.7 will have an extended period of bugfix
maintenance.

2.7 includes many features that were first released in Python 3.1. The faster io
module, the new nested with statement syntax, improved float repr, set literals,
dictionary views, and the memoryview object have been backported from 3.1. Other
features include an ordered dictionary implementation, unittest improvements, a
new sysconfig module, auto-numbering of fields in the str/unicode format method,
and support for ttk Tile in Tkinter.  For a more extensive list of changes in
2.7, see http://doc.python.org/dev/whatsnew/2.7.html or Misc/NEWS in the Python
distribution.

To download Python 2.7 visit:

     http://www.python.org/download/releases/2.7/

While this is a preview release and is thus not suitable for production use, we
strongly encourage Python application and library developers to test the release
with their code and report any bugs they encounter to:

     http://bugs.python.org/

This helps ensure that those upgrading to Python 2.7 will encounter as few bumps
as possible.

2.7 documentation can be found at:

     http://docs.python.org/2.7/


Enjoy!

--
Benjamin Peterson
Release Manager
benjamin at python.org
(on behalf of the entire python-dev team and 2.7's contributors)

From pje at telecommunity.com  Mon Jun 21 19:17:57 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 13:17:57 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621114307.48735698@heresy>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
Message-ID: <20100621171803.B35C33A414B@sparrow.telecommunity.com>

At 11:43 AM 6/21/2010 -0400, Barry Warsaw wrote:
>On Jun 21, 2010, at 10:20 PM, Nick Coghlan wrote:
> >Something that may make sense to ease the porting process is for some
> >of these "on the boundary" I/O related string manipulation functions
> >(such as os.path.join) to grow "encoding" keyword-only arguments. The
> >recommended approach would be to provide all strings, but bytes could
> >also be accepted if an encoding was specified. (If you want to mix
> >encodings - tough, do the decoding yourself).
>
>This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
>for it.
>
>Would it make sense to have "encoding-carrying" bytes and str types?

It's not a stupid idea, and could potentially work.  It also might 
have a better chance of being able to actually be *implemented* in 
3.x than my idea.

>Basically, I'm thinking of types (maybe even the current ones) that carry
>around a .encoding attribute so that they can be automatically encoded and
>decoded where necessary.  This at least would simplify APIs that need to do
>the conversion.

I'm not really sure how much use the encoding is on a unicode object 
- what would it actually mean?

Hm. I suppose it would effectively mean "this string can be 
represented in this encoding" -- which is useful, in that you could 
fail operations when combining with bytes of a different encoding.

Hm... no, in that case you should just encode the string to the 
bytes' encoding, and let that throw an error if it fails.  So, 
really, there's no reason for a string to know its encoding.  All you 
need is the bytes type to have an encoding attribute, and when doing 
mixed-type operations between bytes and strings, coerce to *bytes of 
the same encoding*.

However, if .encoding is None, then coercion would follow the same 
rules as now -- i.e., convert the bytes to unicode, assuming an ascii 
encoding.  (This would be different than setting an encoding of 
'ascii', because in that case, it means you want cross-type 
operations to result in ascii bytes, rather than a  unicode string, 
and to fail if the unicode part can't be encoded appropriately.  The 
'None' setting is effectively a nod to compatibility with prior 3.x 
versions, since I assume we can't just throw out the old coercion behavior.)

Then, a few more changes to the bytes type would round out the implementation:

* Allow .decode() to not specify an encoding, unless .encoding is None

* Add back in the missing string methods (e.g. .encode()), since you 
can transparently upgrade to a string

* Smart __str__, as shown in your proposal.
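
A hedged sketch of the coercion rules outlined above, restricted to
concatenation and .decode(). The ebytes class is hypothetical and is written
as a bytes subclass, whereas the text talks about changing bytes itself; the
behaviour for .encoding of None follows the fallback described above rather
than anything Python 3 does today:

```python
class ebytes(bytes):
    """Hypothetical bytes subclass carrying an optional .encoding."""

    def __new__(cls, data, encoding=None):
        if isinstance(data, str):
            if encoding is None:
                raise TypeError("an encoding is required to build from str")
            data = data.encode(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def decode(self, encoding=None, errors='strict'):
        # .decode() may omit the encoding unless .encoding is None.
        encoding = encoding or self.encoding
        if encoding is None:
            raise TypeError("no encoding known; pass one explicitly")
        return bytes.decode(self, encoding, errors)

    def __add__(self, other):
        if isinstance(other, str):
            if self.encoding is None:
                # Fallback described above: treat the bytes as ASCII text.
                return bytes.decode(self, 'ascii') + other
            other = other.encode(self.encoding)
        return ebytes(bytes(self) + other, self.encoding)

    def __radd__(self, other):
        if isinstance(other, str):
            if self.encoding is None:
                return other + bytes.decode(self, 'ascii')
            other = other.encode(self.encoding)
        return ebytes(other + bytes(self), self.encoding)


eb = ebytes('\u30cf\u30ed\u30fc', 'euc-jp')
joined = eb + '\u30ef'               # str is encoded to euc-jp bytes
legacy = ebytes(b'abc') + 'def'      # no encoding: falls back to str
```

Note that the constructor accepts either bytes or a str that can be encoded,
which is the idempotent behaviour mentioned further down.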


>Would it be feasible?  Dunno.

Probably, although it might mean adding back in special cases that 
were previously taken out, and a few new ones.


>   Would it help ease the bytes/str confusion?  Dunno.

Not sure what confusion you mean -- Web-SIG and I at least are not 
confused about the difference between bytes and str, or we wouldn't 
be having an issue.  ;-)  Or maybe you mean the stdlib's API 
confusion?  In which case, yes, definitely!


>   But I think it would help make APIs easier to design and use because
>it would cut down on the encoding-keyword function signature infection.

Not only that, but I believe it would also retroactively make the 
stdlib's implementation of those APIs "correct" again, and give us 
One Obvious Way to work with bytes of a known encoding, while 
constraining any unicode that gets combined with those bytes to be 
validly encodable.  It also gives you an idempotent constructor for 
bytes of a specified encoding, that can take either a bytes of 
unspecified encoding, a bytes of the correct encoding, or a string 
that can be encoded as such.

In short, +1.  (I wish it were possible to go back and make bytes 
non-strings and have only this ebytes or bstr or whatever type have 
string methods, but I'm pretty sure that ship has already sailed.)


From pje at telecommunity.com  Mon Jun 21 19:24:10 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 13:24:10 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621163404.GV5787@unaka.lan>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
Message-ID: <20100621172413.578853A404D@sparrow.telecommunity.com>

At 12:34 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
>What do you think of making the encoding attribute a mandatory part of
>creating an ebytes object?  (ex: ``eb = ebytes(b, 'euc-jp')``).

As long as the coercion rules force str+ebytes (or str % ebytes, 
ebytes % str, etc.) to result in another ebytes (and fail if the str 
can't be encoded in the ebytes' encoding), I'm personally fine with 
it, although I really like the idea of tacking the encoding to bytes 
objects in the first place.

OTOH, one potential problem with having the encoding on the bytes 
object rather than the ebytes object is that then you can't easily 
take bytes from a socket and then say what encoding they are, without 
interfering with the sockets API (or whatever other place you get the 
bytes from).

So, on balance, making ebytes a separate type (perhaps one that's 
just a pointer to the bytes and a pointer to the encoding) would 
indeed make more sense.  It having different coercion rules for 
interacting with strings would make more sense too in that 
case.  (The ideal, of course, would still be to not let bytes objects 
be stringlike at all, with only ebytes acting string-like.  That way, 
you'd be forced to be explicit about your encoding when working with 
bytes, but all you'd need to do was make an ebytes call.)


From a.badger at gmail.com  Mon Jun 21 18:56:11 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 12:56:11 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100621165611.GW5787@unaka.lan>

On Tue, Jun 22, 2010 at 01:08:53AM +0900, Stephen J. Turnbull wrote:
> Lennart Regebro writes:
> 
>  > 2010/6/21 Stephen J. Turnbull <stephen at xemacs.org>:
>  > > IMO, the UI is right.  "Something" like the above "ought" to work.
>  > 
>  > Right. That said, many times when you want to do urlparse etc they
>  > might be binary, and you might want binary. So maybe the methods
>  > should work with both?
> 
> First, a caveat: I'm a Unicode/encodings person, not an experienced
> web programmer.  My opinions on whether this would work well in
> practice should be taken with a grain of salt.
> 
> Speaking for myself, I live in a country where the natives have
> saddled themselves with no less than 4 encodings in common use, and I
> would never want "binary" since none of them would display as anything
> useful in a traceback.  Wherever possible, I decode "blobs" into
> structured objects, I do it as soon as possible, and if for efficiency
> reasons I want to do this lazily, I store the blob in a separate
> .raw_object attribute.  If they're textual, I decode them to text.  I
> can't see an efficiency argument for decoding URIs lazily in most
> applications.
> 
> In the case of structured text like URIs, I would create a separate
> class for handling them with string-like operations.  Internally, all
> text would be raw Unicode (ie, not url-encoded); repr(uri) would use
> some kind of readable quoting convention (not url-encoding) to
> disambiguate random reserved characters from separators, while
> str(uri) would produce an url-encoded string.  Converting to and from
> wire format is just .encode and .decode, then, and in this country you
> need to be flexible about which encoding you use.
> 
> Agreed, this stuff is really annoying.  But I think that just comes
> with the territory.  PJE reports that folks don't like doing encoding
> and decoding all over the place.  I understand that, but if they're
> doing a lot of that, I have to wonder why.  Why not define the one
> line function and get on with life?
> 
> The thing is, where I live, it's not going to be a one line function.
> I'm going to be dealing with URLs that are url-encoded representations
> of UTF-8, Shift-JIS, EUC-JP, and occasionally RFC 2047!  So I need an
> API that explicitly encodes and decodes.  And I need an API that
> presents Japanese as Japanese rather than as line noise.
> 
> Eg, PJE writes
> 
>     Ugh.  I meant: 
> 
>     newurl = urljoin(str(base, 'latin-1'), 'subdir').encode('latin-1')
> 
>     Which just goes to the point of how ridiculous it is to have to  
>     convert things to strings and back again to use APIs that ought to  
>     just handle bytes properly in the first place. 
> 
> But if you need that "everywhere", what's so hard about
> 
> def urljoin_wrapper (base, subdir):
>     return urljoin(str(base, 'latin-1'), subdir).encode('latin-1')
> 
> Now, note how that pattern fails as soon as you want to use
> non-ISO-8859-1 languages for subdir names.  In Python 3, the code
> above is just plain buggy, IMHO.  The original author probably will
> never need the generalization.  But her name will be cursed unto the
> nth generation by people who use her code on a different continent.
> 
> The net result is that bytes are *not* a programmer- or user-friendly
> way to do this, except for the minority of the world for whom Latin-1
> is a good approximation to their daily-use unibyte encoding (eg, it's
> probably usable for debugging in Dansk, but you won't win any
> popularity contests in Tel Aviv or Shanghai).
> 
One comment here -- you can also have uris that aren't decodable into their
true textual meaning using a single encoding.

Apache will happily serve out uris that have utf-8, shift-jis, and euc-jp
components inside of their path but the textual representation that was intended
will be garbled (or be represented by escaped byte sequences).  For that
matter, apache will serve requests that have no true textual representation
as it is working on the byte level rather than the character level.

So a complete solution really should allow the programmer to pass in uris as
bytes when the programmer knows that they need it.
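
A small sketch of such a uri, using urllib.parse.unquote_to_bytes to stay at
the byte level. The path is made up for illustration: its first component is
percent-encoded UTF-8 and its second is percent-encoded EUC-JP, both spelling
the same katakana character:

```python
from urllib.parse import unquote_to_bytes

path = '/%E3%83%8F/%A5%CF'
raw = unquote_to_bytes(path)         # stays bytes, no encoding assumed

try:
    raw.decode('utf-8')              # the EUC-JP component is not valid UTF-8
except UnicodeDecodeError:
    whole_path_decodable = False
else:
    whole_path_decodable = True

# Only a caller who knows the encoding of each component can recover text.
segments = raw.split(b'/')
texts = [segments[1].decode('utf-8'), segments[2].decode('euc-jp')]
```

Working on bytes lets the program route or store the uri correctly even when
no single encoding describes the whole path.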

-Toshio

From tjreedy at udel.edu  Mon Jun 21 19:27:30 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 13:27:30 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C1EE2E1.5030105@udel.edu>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<hvm8gk$tjq$1@dough.gmane.org>	<20100621013405.19DC33A4099@sparrow.telecommunity.com>
	<4C1EE2E1.5030105@udel.edu>
Message-ID: <hvo7e3$og3$1@dough.gmane.org>

On 6/20/2010 11:56 PM, Terry Reedy wrote:

> The specific example is
>
>  >>> urllib.parse.parse_qsl('a=b%e0')
> [('a', 'b?')]
>
> where the character after 'b' is white ? in dark diamond, indicating an
> error.
>
> parse_qsl() splits that input on '=' and sends each piece to
> urllib.parse.unquote
> unquote() attempts to "Replace %xx escapes by their single-character
> equivalent.". unquote has an encoding parameter that defaults to 'utf-8'
> in *its* call to .decode. parse_qsl does not have an encoding parameter.
> If it did, and it passed that to unquote, then
> the above example would become (simulated interaction)
>
>  >>> urllib.parse.parse_qsl('a=b%e0', encoding='latin-1')
> [('a', 'bà')]
>
> I got that output by copying the file and adding "encoding='latin-1'" to
> the unquote call.
>
> Does this solve this problem?
> Has anything like this been added for 3.2?
> Should it be?

With a little searching, I found
http://bugs.python.org/issue5468
with Miles Kaufmann's year-old comment "parse_qs and parse_qsl should 
also grow encoding and errors parameters to pass to the underlying 
unquote()". Patch review is needed.
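
For reference, a small sketch of the behaviour being asked for.  It assumes
an interpreter in which parse_qsl() accepts the encoding keyword requested
in that issue (3.1 does not); the variable names are only illustrative.

```python
from urllib.parse import parse_qsl

query = 'a=b%e0'

# Default: the escaped byte 0xE0 is decoded as UTF-8 with errors='replace',
# so the lone byte becomes U+FFFD REPLACEMENT CHARACTER.
default_result = parse_qsl(query)

# With an explicit encoding, 0xE0 is decoded as Latin-1 (a with grave).
latin1_result = parse_qsl(query, encoding='latin-1')
```

Only the second call recovers the character the sender presumably meant,
and only because the caller knew the encoding out of band.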

Terry Jan Reedy


From stephan.richter at gmail.com  Mon Jun 21 17:13:06 2010
From: stephan.richter at gmail.com (Stephan Richter)
Date: Mon, 21 Jun 2010 11:13:06 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>
Message-ID: <201006211113.06767.stephan.richter@gmail.com>

On Monday, June 21, 2010, Nick Coghlan wrote:
> A decent listing of major packages that already support Python 3 would
> be very handy for the new Python2orPython3 page I created on the wiki,
> and easier to keep up-to-date. (the old Early2to3Migrations page
> didn't look particularly up to date, but hopefully we can keep the new
> list in a happier state).

I really just want to be able to go to PyPI, Click on "Browse packages" and 
then select "Python 3" (it can currently be accomplished by clicking "Python" 
and then "3"). Of course, package developers need to be encouraged to add 
these Trove classifiers so that the listings are as complete as possible.

Regards,
Stephan
-- 
Entrepreneur and Software Geek
Google me. "Zope Stephan Richter"

From guido at python.org  Mon Jun 21 19:29:27 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 21 Jun 2010 10:29:27 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621164650.16A093A414B@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com> 
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com> 
	<20100621164650.16A093A414B@sparrow.telecommunity.com>
Message-ID: <AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>

On Mon, Jun 21, 2010 at 9:46 AM, P.J. Eby <pje at telecommunity.com> wrote:
> At 10:51 PM 6/21/2010 +1000, Nick Coghlan wrote:
>>
>> It may be that there are places where we need to rewrite standard
>> library algorithms to be bytes/str neutral (e.g. by using length one
>> slices instead of indexing). It may be that there are more APIs that
>> need to grow "encoding" keyword arguments that they then pass on to
>> the functions they call or use to convert str arguments to bytes (or
>> vice-versa). But without people trying to port affected libraries and
>> reporting bugs when they find issues, the situation isn't going to
>> improve.
>>
>> Now, if these bugs are already being reported against 3.1 and just
>> aren't getting fixed, that's a completely different story...
>
> The overall impression, though, is that this isn't really a step forward.
> Now, bytes are the special case instead of unicode, but that special case
> isn't actually handled any better by the stdlib - in fact, it's arguably
> worse.  And, the burden of addressing this seems to have been shifted from
> the people who made the change, to the people who are going to use it.  But
> those people are not necessarily in a position to tell you anything more
> than, "give me something that works with bytes".
>
> What I can tell you is that before, since string constants in the stdlib
> were ascii bytes, and transparently promoted to unicode, stdlib behavior was
> *predictable* in the presence of special cases: you got back either bytes or
> unicode, but either way, you could idempotently upgrade the result to
> unicode, or just pass it on.  APIs were "str safe, unicode aware".  If you
> passed in bytes, you weren't going to get unicode without a warning, and if
> you passed in unicode, it'd work and you'd get unicode back.

Actually, the big problem with Python 2 is that if you mix str and
unicode, things work or crash depending on whether any of the str
objects involved contain non-ASCII bytes.

If one API decides to upgrade to Unicode, the result, when passed to
another API, may well cause a UnicodeError because not all arguments
have had the same treatment.
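
A rough illustration of that data-dependent failure, written for Python 3.
The helper below is hypothetical and only mimics Python 2's implicit
promotion, which decoded the str operand as ASCII before combining it with
unicode:

```python
def py2_style_concat(text, raw):
    """Mimic Python 2's implicit promotion: decode raw as ASCII, then join."""
    return text + raw.decode('ascii')


# Pure-ASCII bytes: the promotion succeeds and nobody notices it happened.
ok = py2_style_concat('caf', b'e')

# Non-ASCII bytes: the very same call now fails, depending only on the data.
try:
    py2_style_concat('caf', b'\xc3\xa9')
    failed = False
except UnicodeDecodeError:
    failed = True
```

The code path is identical in both calls; only the input bytes differ.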

> Now, the APIs are neither safe nor aware -- if you pass bytes in, you get
> unpredictable results back.

This seems an overgeneralization of a particular bug. There are APIs
that are strictly text-in, text-out. There are others that are
bytes-in, bytes-out. Let's call all those *pure*. For some operations
it makes sense that the API is *polymorphic*, by which I mean that
text-in causes text-out, and bytes-in causes bytes-out. All of these
are fine.

Perhaps there are more situations where a polymorphic API would be
helpful. Such APIs are not always so easy to implement, because they
have to be careful with literals or other constants (and even more so
mutable state) used internally -- but it can be done, and there are
plenty of examples in the stdlib.
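
A minimal sketch of what "careful with literals" can look like.  The
function name is made up for illustration and is not a stdlib API:

```python
def ensure_trailing_sep(path):
    """Polymorphic helper: str in gives str out, bytes in gives bytes out."""
    # Pick the literal that matches the argument's type.
    sep = '/' if isinstance(path, str) else b'/'
    # Use a length-one slice rather than indexing: b'a/'[-1] is the int 47,
    # while b'a/'[-1:] is b'/', so slicing behaves the same for both types.
    if path[-1:] != sep:
        path = path + sep
    return path


text_result = ensure_trailing_sep('usr')
bytes_result = ensure_trailing_sep(b'usr')
```

The function never mixes the two types, so a caller that passes bytes
consistently gets bytes back, and likewise for text.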

The real problem apparently lies in (what I believe is only a few
rare) APIs that are text-or-bytes-in and always-text-out (or
always-bytes-out). Let's call them *hybrid*. Clearly, mixing hybrid
APIs in a stream of pure or polymorphic API calls is a problem,
because they turn a pure or polymorphic overall operation into a
hybrid one.

There are also text-in, bytes-out or bytes-in, text-out APIs that are
intended for encoding/decoding of course, but these are in a totally
different class.

Abstractly, it would be good if there were as few as possible hybrid
APIs, many pure or polymorphic APIs (which it should be in a
particular case is a pragmatic choice), and a limited number of
encoding/decoding APIs, which should generally be invoked at the edges
of the program (e.g., I/O).

> Ironically, it almost *would* have been better if bytes simply didn't work
> as strings at all, *ever*, but if you could wrap them with a bstr() to
> *treat* them as text.  You could still have restrictions on combining them,
> as long as it was a restriction on the unicode you mixed with them.  That
> is, if you could combine a bstr and a str if the *str* was restricted to
> ASCII.

ISTR that we considered something like this and decided to stay away
from it. At this point I think that a successful 3rd party bstr
implementation would be required before we rush to add one to the
stdlib.

> If we had the Python 3 design discussions to do over again, I think I would
> now have stuck with the position of not letting bytes be string-compatible
> at all,

They aren't, unless you consider the presence of some methods with
similar behavior (.lower(), .split() and so on) and the existence of
some polymorphic APIs (see above) as "compatibility".

> and instead proposed an explicit bstr() wrapper/adapter to use them
> as strings, that would (in that case) force coercion in the direction of
> bytes rather than strings.  (And bstr need not have been a builtin - it
> could have been something you import, to help discourage casual usage.)

I'm still unclear on exactly what bstr is supposed to be, but it sounds
a bit like one of the rejected proposals for having a single
(Unicode-capable) str type that is implemented using different width
encodings (Latin-1, UCS-2, UCS-4) underneath.

> Might this approach lead to some people doing things wrong in the case of
> porting?  Sure.  But there'd be little reason to use it in new code that
> didn't have a real need for bytestring manipulation.
>
> It might've been a better balance between practicality and purity, in that
> it keeps the language pure, while offering a practical way to deal with
> things in bytes if you really need to.  And, bytes wouldn't silently succeed
> *some* of the time, leading to a trap.  An easy inconsistency is worse than
> a bit of uniform chicken-waving.

I still believe that the instances of bytes silently
succeeding *some* of the time refers to specific bugs in specific
APIs, either intentional because of misguided compatibility desires,
or accidental in the haste of trying to convert the entire stdlib to
Python 3 in a finite time.

> Is it too late to make that tradeoff?  Probably.  Certainly it's not
> practical to *implement* outside the language core, and removing string
> methods would fux0r anybody whose currently-ported code relies on bytes
> objects having string-like methods.
>
>


-- 
--Guido van Rossum (python.org/~guido)

From pje at telecommunity.com  Mon Jun 21 19:29:55 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 13:29:55 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C1F9833.2080905@voidspace.org.uk>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
	<20100621164650.16A093A414B@sparrow.telecommunity.com>
	<4C1F9833.2080905@voidspace.org.uk>
Message-ID: <20100621172957.EB55C3A404D@sparrow.telecommunity.com>

At 05:49 PM 6/21/2010 +0100, Michael Foord wrote:
>Why is your proposed bstr wrapper not practical to implement outside 
>the core and use in your own libraries and frameworks?

__contains__ doesn't have a converse operation, so you can't code a 
type that works around this (Python 3.1 shown):

 >>> from os.path import join
 >>> join(b'x','y')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "c:\Python31\lib\ntpath.py", line 161, in join
     if b[:1] in seps:
TypeError: Type str doesn't support the buffer API
 >>> join('y',b'x')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "c:\Python31\lib\ntpath.py", line 161, in join
     if b[:1] in seps:
TypeError: 'in <string>' requires string as left operand, not bytes

IOW, only one of these two cases can be worked around by using a bstr 
(or ebytes) that doesn't have support from the core string type.

I'm not sure if the "in" operator is the only case where implementing 
such a type would fail, but it's the most obvious one.  String 
formatting, of both the % and .format() varieties is 
another.  (__rmod__ doesn't help if your bytes object is one of 
several data items in a tuple or dict -- the common case for % formatting.)
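
A sketch of the asymmetry, using a hypothetical bytes subclass (EBytes is
an illustrative name, not an existing type).  Concatenation has a reflected
hook, and containment works when the subclass is the container, but there
is nothing to override when a plain str is the container:

```python
class EBytes(bytes):
    """Hypothetical bytes subclass that tries to interoperate with str."""

    def __radd__(self, other):
        # Reflected hook: lets  str + EBytes  work by encoding the str side.
        if isinstance(other, str):
            return EBytes(other.encode('ascii') + bytes(self))
        return NotImplemented

    def __contains__(self, item):
        # EBytes is the container here, so it controls the outcome.
        if isinstance(item, str):
            item = item.encode('ascii')
        return bytes.__contains__(self, item)


joined = 'x' + EBytes(b'y')     # handled by EBytes.__radd__
found = 'y' in EBytes(b'xyz')   # handled by EBytes.__contains__

try:
    EBytes(b'y') in 'xyz'       # str.__contains__ decides; no reflected hook
    str_container_ok = True
except TypeError:
    str_container_ok = False
```

The last case is the one that needs cooperation from the core str type.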


From tjreedy at udel.edu  Mon Jun 21 19:36:38 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 13:36:38 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621114307.48735698@heresy>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
Message-ID: <hvo7v8$qip$1@dough.gmane.org>

On 6/21/2010 11:43 AM, Barry Warsaw wrote:

> This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
> for it.
>
> Would it make sense to have "encoding-carrying" bytes and str types?

On 2009-11-5 I posted 'Add encoding attribute to bytes' to python-ideas. 
It was shot down at the time.

Terry Jan Reedy


From tjreedy at udel.edu  Mon Jun 21 19:45:20 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 13:45:20 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
Message-ID: <hvo8fh$shu$1@dough.gmane.org>

On 6/21/2010 8:51 AM, Nick Coghlan wrote:

>
> I don't know that the "all is well" camp actually exists. The camp
> that I do see existing is the one that says "without a bug report,
> inconsistencies in the standard library's unicode handling won't get
> fixed".
>
> The issues picked up by the regression test suite have already been
> dealt with, but that suite is unfortunately far from comprehensive.
> Just like a lot of Python code that is out there, the standard library
> isn't immune to the poor coding practices that were permitted by the
> blurry lines between text and octet streams in 2.x.
>
> It may be that there are places where we need to rewrite standard
> library algorithms to be bytes/str neutral (e.g. by using length one
> slices instead of indexing). It may be that there are more APIs that
> need to grow "encoding" keyword arguments that they then pass on to
> the functions they call or use to convert str arguments to bytes (or
> vice-versa). But without people trying to port affected libraries and
> reporting bugs when they find issues, the situation isn't going to
> improve.
>
> Now, if these bugs are already being reported against 3.1 and just
> aren't getting fixed, that's a completely different story...

Some of the above have been, over a year ago. See, for instance,
http://bugs.python.org/issue5468
I am getting the impression that the people who use the web modules 
tend, like me, to not have the tools to write and test patches. So they 
can squeak but not grease.

Terry Jan Reedy


From pje at telecommunity.com  Mon Jun 21 19:46:56 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 13:46:56 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621165611.GW5787@unaka.lan>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
Message-ID: <20100621174659.D65403A404D@sparrow.telecommunity.com>

At 12:56 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
>One comment here -- you can also have uris that aren't decodable into their
>true textual meaning using a single encoding.
>
>Apache will happily serve out uris that have utf-8, shift-jis, and euc-jp
>components inside of their path but the textual representation that 
>was intended
>will be garbled (or be represented by escaped byte sequences).  For that
>matter, apache will serve requests that have no true textual representation
>as it is working on the byte level rather than the character level.
>
>So a complete solution really should allow the programmer to pass in uris as
>bytes when the programmer knows that they need it.

ebytes(somebytes, 'garbage'), perhaps, which would be like ascii, but 
where combining with non-garbage would result in another 'garbage' ebytes?


From janssen at parc.com  Mon Jun 21 19:56:59 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 10:56:59 PDT
Subject: [Python-Dev] red buildbots on 2.7
Message-ID: <73196.1277143019@parc.com>

Considering that we've just released 2.7rc2, there are an awful lot of
red buildbots for 2.7.  In fact, I don't remember having seen a green
buildbot for OS X and 2.7.  Shouldn't these be fixed?

On OS X Leopard, I'm seeing failures in test_py3kwarn,
test_urllib2_localnet, test_uuid.

On OS X Tiger, I'm seeing failures in test_pep277, test_py3kwarn,
test_ttk_guionly, and test_urllib2_localnet.

We don't have a buildbot running Snow Leopard, apparently.

Bill

From turnbull at sk.tsukuba.ac.jp  Mon Jun 21 19:58:22 2010
From: turnbull at sk.tsukuba.ac.jp (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 02:58:22 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621145133.7F5333A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
Message-ID: <8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:

 > Note too that this is an argument for symmetry in wrapping the
 > inputs and outputs, so that the code doesn't have to "know" what
 > it's dealing with!

and

 > After all, right now, if a stdlib function might return bytes or 
 > unicode depending on runtime conditions, I can't even hardcode an 
 > .encode() call -- it would fail if the return type is a bytes.

I'm lost.  What stdlib functions are you talking about whose return
type depends on runtime conditions, and what runtime conditions?  What
do you mean by "wrapping"?

The only times I've run into str/bytes nondeterminacy is when I've
mixed str/bytes myself, and passed them into functions that are
type-identities (str -> str, bytes -> bytes), which then appear to
give a nondeterministic result.  It's a deterministic bug, though,
always mine.<wink>

 > It's One Obvious Way that I want, but some people seem to be arguing 
 > that the One Obvious Way is to Think Carefully About It Every Time -- 
 > and that seems to violate the "Obvious" part, IMO.  ;-)

Nick alluded to the One Obvious Way as a change in architecture.

Specifically: Decode all bytes to typed objects (str, images, audio,
structured objects) at input.  Do no manipulations on bytes ever
except decode and encode (both to text, and to special-purpose objects
such as images) in a program that does I/O.  (Obviously image
manipulation libraries etc will have to operate on bytes, but they
should have no functions that consume bytes except constructors a la
bytes.decode() for text, and no functions that produce bytes except
the output serializers that write files and the like, a la
str.encode().)  Encode back to bytes on output.

Yes, this is tedious if you live in an ASCII world, compared to using
bytes as characters.  However, it works for the rest of us, which the
old style doesn't.

As for "Think Carefully About It Every Time", that is required only in
Porting Programs That Mix Operation On Bytes With Operation On Str.
If you write programs from scratch, however, the decode-process-encode
paradigm quickly becomes second nature.
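
A minimal sketch of the decode-process-encode shape, assuming UTF-8 at both
edges.  The function names are illustrative only:

```python
import io


def shout(text):
    """Pure text-in, text-out processing step."""
    return text.upper()


def run(infile, outfile, encoding='utf-8'):
    raw = infile.read()                      # bytes arrive at the input edge
    text = raw.decode(encoding)              # decode once, immediately
    result = shout(text)                     # all manipulation happens on str
    outfile.write(result.encode(encoding))   # encode once, at the output edge


# In-memory binary streams stand in for real files or sockets.
source = io.BytesIO('señor'.encode('utf-8'))
sink = io.BytesIO()
run(source, sink)
```

Nothing between the decode and the encode ever touches bytes, so the
processing step cannot mix the two types by accident.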

From stephen at xemacs.org  Mon Jun 21 20:08:42 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 03:08:42 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621114307.48735698@heresy>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
Message-ID: <871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>

Barry Warsaw writes:

 > Would it make sense to have "encoding-carrying" bytes and str
 > types?

Why limit that to bytes and str?  Why not have all objects carry their
serializer/deserializer around with them?

I think the answer is "no", though, because (1) it would constitute an
attractive nuisance (the default would be abused, it would work fine
in Kansas, and all hell would break loose in Kagoshima, simply
delaying the pain and/or passing it on to third parties), and (2) you
really want this under control of higher level objects that have
access to some knowledge of the environment, rather than the lowest
level.

From pje at telecommunity.com  Mon Jun 21 20:17:47 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 14:17:47 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
	<20100621164650.16A093A414B@sparrow.telecommunity.com>
	<AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>
Message-ID: <20100621181750.267933A404D@sparrow.telecommunity.com>

At 10:29 AM 6/21/2010 -0700, Guido van Rossum wrote:
>Perhaps there are more situations where a polymorphic API would be
>helpful. Such APIs are not always so easy to implement, because they
>have to be careful with literals or other constants (and even more so
>mutable state) used internally -- but it can be done, and there are
>plenty of examples in the stdlib.

What if we could use the time machine to make the APIs that *were* 
polymorphic, regain their previously-polymorphic status, without 
needing to actually *change* any of the code of those functions?

That's what Barry's ebytes proposal would do, with appropriate 
coercion rules.  Passing ebytes into such a function would yield back 
ebytes, even if the function used strings internally, as long as 
those strings could be encoded back to bytes using the ebytes' 
encoding.  (Which would normally be the case, since stdlib constants 
are almost always ASCII, and the main use cases for ebytes would 
involve ascii-extended encodings.)


>I'm still unclear on exactly what bstr is supposed to be, but it sounds
>a bit like one of the rejected proposals for having a single
>(Unicode-capable) str type that is implemented using different width
>encodings (Latin-1, UCS-2, UCS-4) underneath.

Not quite - as modified by Barry's proposal (which I like better than 
mine) it'd be an object that just combines bytes with an attribute 
indicating the underlying encoding.  When it interacts with strings, 
the strings are *encoded* to bytes, rather than upgrading the bytes to text.

This is actually a big advantage for error-detection in any 
application where you're working with data that *must* be encodable 
in a specific encoding for output, as it allows you to catch errors 
much *earlier* than you would if you only did the encoding at your 
output boundary.
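
As a rough sketch of the idea (not a worked-out design): the class below is
hypothetical, is written as a plain wrapper rather than anything the core
would provide, and ignores the question of mixing two different encodings.
It shows str operands being encoded on contact, so that an unencodable
string fails at the point of combination:

```python
class EBytes:
    """Hypothetical 'bytes plus encoding' wrapper, for illustration only."""

    def __init__(self, data, encoding):
        self.data = bytes(data)
        self.encoding = encoding

    def _coerce(self, other):
        # str operands are *encoded* to bytes, never the other way around.
        if isinstance(other, str):
            return other.encode(self.encoding)
        if isinstance(other, EBytes):
            return other.data
        if isinstance(other, (bytes, bytearray)):
            return bytes(other)
        raise TypeError('cannot combine EBytes with %r' % type(other))

    def __add__(self, other):
        return EBytes(self.data + self._coerce(other), self.encoding)

    def __radd__(self, other):
        return EBytes(self._coerce(other) + self.data, self.encoding)


header = EBytes(b'Subject: ', 'ascii')
ok = header + 'hello'            # encodable: the result is still an EBytes

try:
    header + 'h\u00e9llo'        # not ASCII: fails here, not at output time
    early_error = False
except UnicodeEncodeError:
    early_error = True
```

The addition cases work with an ordinary user-defined class; the cases that
need help from the core are the ones without a reflected method.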

Anyway, this would not be the normal bytes type or string type; it's 
"bytes with an encoding".  It's also more general than Unicode, in 
the sense that it allows you to work with character sets that don't 
really *have* a proper Unicode mapping.

One issue I remember from my "enterprise" days is some of the 
Asian-language developers at NTT/Verio explaining to me that unicode 
doesn't actually solve certain issues -- that there are use cases 
where you really *do* need "bytes plus encoding" in order to properly 
express something.  Unfortunately, I never quite wrapped my head 
around the idea, I just remember it had something to do with the fact 
that Unicode has single character codes that mean different things in 
different languages, such that you were actually losing information 
by converting to unicode, or something like that.  (Or maybe the 
characters were expressed differently in certain encodings according 
to what language they came from, so you couldn't roundtrip them 
through unicode without losing information.  I think that's probably 
what it was; maybe somebody here can chime in more on that point.)

Anyway, a type like this would need to have at least a bit of support 
from the core language, because the str type would need to be able to 
handle at least the __contains__ and %/.format() coercion cases, 
since these functions don't have __r*__ equivalents that a 
user-implemented type could provide...  and strings don't have 
anything like a '__coerce__' either.

If sufficient hooks existed, then an ebytes could be implemented 
outside the stdlib, and still used within it.


From benjamin at python.org  Mon Jun 21 20:23:57 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 21 Jun 2010 13:23:57 -0500
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <73196.1277143019@parc.com>
References: <73196.1277143019@parc.com>
Message-ID: <AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>

2010/6/21 Bill Janssen <janssen at parc.com>:
> Considering that we've just released 2.7rc2, there are an awful lot of
> red buildbots for 2.7.  In fact, I don't remember having seen a green
> buildbot for OS X and 2.7.  Shouldn't these be fixed?

It seems most of them are offline and their last run was just a failure.

>
> On OS X Leopard, I'm seeing failures in test_py3kwarn,
> test_urllib2_localnet, test_uuid.
>
> On OS X Tiger, I'm seeing failures in test_pep277, test_py3kwarn,
> test_ttk_guionly, and test_urllib2_localnet.

File bug reports.

>
> We don't have a buildbot running Snow Leopard, apparently.


-- 
Regards,
Benjamin

From stephen at xemacs.org  Mon Jun 21 20:20:43 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 03:20:43 +0900
Subject: [Python-Dev] [OT] carping about irritating people (was: bytes
	/	unicode)
In-Reply-To: <871vc0plt6.fsf_-_@benfinney.id.au>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<871vc0plt6.fsf_-_@benfinney.id.au>
Message-ID: <87wrts2qo4.fsf@uwakimon.sk.tsukuba.ac.jp>

Ben Finney writes:
 > "Stephen J. Turnbull" <stephen at xemacs.org> writes:
 > 
 > > your base URL is gonna be b'mailto:stephen at xemacs.org', but the
 > > natural thing the UI will want to do is
 > >
 > >     formurl = baseurl + '?subject=??????????'
 > 
 > Incidentally, which irritating person was the topic of this
 > Japanese-language message to you?

(Kudos to Nick.)  "Urusai" is also used to refer to the finicky.  So,
the RFC-toting pedant.  Ie, me.

 > (The subject in Stephen's example message translates roughly as
 > "(unspecified third person)

Not quite.  The subject of the copula, if omitted, is entirely
context-dependent.


From pje at telecommunity.com  Mon Jun 21 20:24:27 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 14:24:27 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvo7v8$qip$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <hvo7v8$qip$1@dough.gmane.org>
Message-ID: <20100621182430.6213D3A404D@sparrow.telecommunity.com>

At 01:36 PM 6/21/2010 -0400, Terry Reedy wrote:
>On 6/21/2010 11:43 AM, Barry Warsaw wrote:
>
>>This is probably a stupid idea, and if so I'll plead Monday morning mindfuzz
>>for it.
>>
>>Would it make sense to have "encoding-carrying" bytes and str types?
>
>On 2009-11-5 I posted 'Add encoding attribute to bytes' to 
>python-ideas. It was shot down at the time.

AFAICT, that's mainly for lack of apparent use cases, and also for 
confusion.  Here, the use case (restoring the polymorphy of stdlib 
APIs) is pretty clear.

However, if we had the string equivalent of a coercion protocol (that 
core strings and bytes would co-operate with), then it would enable 
people to write their own versions of either your idea or Barry's 
idea (or other things altogether), and still get the stdlib to play along.

Personally, I think ebytes() would do the trick and it'd be nice to 
see it in stdlib, but gaining a string coercion protocol instead 
might not be a bad tradeoff.  ;-)


From solipsis at pitrou.net  Mon Jun 21 20:37:56 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 21 Jun 2010 20:37:56 +0200
Subject: [Python-Dev] red buildbots on 2.7
References: <73196.1277143019@parc.com>
Message-ID: <20100621203756.2f99757f@pitrou.net>

On Mon, 21 Jun 2010 10:56:59 PDT
Bill Janssen <janssen at parc.com> wrote:
> Considering that we've just released 2.7rc2, there are an awful lot of
> red buildbots for 2.7.  In fact, I don't remember having seen a green
> buildbot for OS X and 2.7.  Shouldn't these be fixed?
> 
> On OS X Leopard, I'm seeing failures in test_py3kwarn,
> test_urllib2_localnet, test_uuid.
> 
> On OS X Tiger, I'm seeing failures in test_pep277, test_py3kwarn,
> test_ttk_guionly, and test_urllib2_localnet.

I'm afraid they can only be fixed by whoever is competent on OS X
issues. If you want to tackle them, you're more than welcome.

There also seem to be a couple of failures left with test_gdb...

Regards

Antoine.


From p.f.moore at gmail.com  Mon Jun 21 20:39:59 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Jun 2010 19:39:59 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <73196.1277143019@parc.com>
References: <73196.1277143019@parc.com>
Message-ID: <AANLkTinHh0pBEX27beeI6fwtPdGoFxfK8MHzumSDP3A_@mail.gmail.com>

On 21 June 2010 18:56, Bill Janssen <janssen at parc.com> wrote:
> Considering that we've just released 2.7rc2, there are an awful lot of
> red buildbots for 2.7.  In fact, I don't remember having seen a green
> buildbot for OS X and 2.7.  Shouldn't these be fixed?

Ack! My buildbot has looked fine, but on closer inspection, it was the
same build that's been running (more accurately, stuck in a test) for
5 days :-(

The main buildslave page looked fine - except for the dates, which I
didn't spot.

Thanks for the alert. I've killed the stuck test and should see some
runs going through now. Shame, really, I was getting used to seeing a
nice page of all green results...

Paul.

From pje at telecommunity.com  Mon Jun 21 20:46:57 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 14:46:57 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100621184700.BAD7F3A404D@sparrow.telecommunity.com>

At 02:58 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
>Nick alluded to The One Obvious Way as a change in architecture.
>
>Specifically: Decode all bytes to typed objects (str, images, audio,
>structured objects) at input.  Do no manipulations on bytes ever
>except decode and encode (both to text, and to special-purpose objects
>such as images) in a program that does I/O.

This ignores the existence of use cases where what you have is text 
that can't be properly encoded in unicode.  I know, it's a hard thing 
to wrap one's head around, since on the surface it sounds like 
unicode is the programmer's savior.  Unfortunately, real-world text 
data exists which cannot be safely roundtripped to unicode, and must 
be handled in "bytes with encoding" form for certain operations.

I personally do not have to deal with this *particular* use case any 
more -- I haven't been at NTT/Verio for six years now.  But I do know 
it exists for e.g. Asian language email handling, which is where I 
first encountered it.  At the time (this *may* have changed), many 
popular email clients did not actually support unicode, so you 
couldn't necessarily just send off an email in UTF-8.  It drove us 
nuts on the project where this was involved (an i18n of an existing 
Python app), and I think we had to compromise a bit in some fashion 
(because we couldn't really avoid unicode roundtripping due to 
database issues), but the use case does actually exist.

My current needs are simpler, thank goodness.  ;-)  However, they 
*do* involve situations where I'm dealing with *other* 
encoding-restricted legacy systems, such as software for interfacing 
with the US Postal Service that only works with a restricted subset 
of latin1, while receiving mangled ASCII from an ecommerce provider, 
and storing things in what's effectively a latin-1 database.  Being 
able to easily assert what kind of bytes I've got would actually let 
me catch errors sooner, *if* those assertions were being checked when 
different kinds of strings or bytes were being combined (i.e., at 
coercion time).


>Yes, this is tedious if you live in an ASCII world, compared to using
>bytes as characters.  However, it works for the rest of us, which the
>old style doesn't.

I'm not trying to go back to the old style -- ideally, I want 
something that would actually improve on the "it's not really 
unicode" use cases above if it were available in 2.x.

I don't want to be "encoding agnostic" or "encoding implicit" -- I 
want to make it possible to be even *more* explicit and restrictive 
than it is currently possible to be in either 2.x OR 3.x.  It's just 
that 3.x affords greater opportunity for doing this, and is an ideal 
place to make the switch -- i.e., at a point where you now have to 
get explicit about your encodings, anyway!


>As for "Think Carefully About It Every Time", that is required only in
>Porting Programs That Mix Operation On Bytes With Operation On Str.
>If you write programs from scratch, however, the decode-process-encode
>paradigm quickly becomes second nature.

Which works if and only if your outputs are truly unicode-able.  If 
you work with legacy systems (e.g. those Asian email clients and US 
postal software), you are really working with a *character set*, not 
unicode, and so putting your data in unicode form is actually *wrong* 
-- an expedient lie.

Heresy, I know, but there you go.  ;-)


From robertc at robertcollins.net  Mon Jun 21 20:59:26 2010
From: robertc at robertcollins.net (Robert Collins)
Date: Tue, 22 Jun 2010 06:59:26 +1200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTinQ_d_vaHBw5IKUYY9qgjqOfFy4XCzC0DYztr9n@mail.gmail.com>

2010/6/21 Stephen J. Turnbull <stephen at xemacs.org>:
> Robert Collins writes:
>
>  > Also, url's are bytestrings - by definition;
>
> Eh?  RFC 3896 explicitly says

 Definitions of Managed Objects for the DS3/E3 Interface Type

Perhaps you mean 3986? :)

>    A URI is an identifier consisting of a sequence of characters
>    matching the syntax rule named <URI> in Section 3.
>
> (where the phrase "sequence of characters" appears in all ancestors I
> found back to RFC 1738), and

Sure, ok, let me unpack what I meant just a little. An abstract URI is
neither unicode nor bytes per se - see section 1.2.1 " A URI is a
sequence of characters from a very limited set: the letters of the
basic Latin alphabet, digits, and a few special characters. "

URI interpretation is fairly strictly separated between producers and
consumers. A consumer can manipulate a url with other url fragments -
e.g. doing urljoin. But it needs to keep the url as a url and not try
to decode it to a unicode representation.

The producer of the url however, can decode via whatever heuristics it
wants - because it defines the encoding used to go from unicode to URL
encoding.

As an example, if I give the uri "http://server/%c3%83", rendering
that as http://server/Ã can lead to transcription errors and
reinterpretation problems unless you know - out of band - that the
server is using utf8 to encode. Conversely, if someone enters
http://server/Ã in their browser window, choosing utf8 or their local
encoding is quite arbitrary and may not match how the server would
represent that resource.
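
A minimal sketch of that ambiguity, using urllib.parse.unquote_to_bytes
from Python 3. The path is the one from the example; the two candidate
encodings (utf8 and latin-1) are assumptions picked for illustration,
since nothing in the URL says which one the server used.

```python
from urllib.parse import unquote_to_bytes

# Percent-decoding is mechanical and yields octets, not text.
raw = unquote_to_bytes("/%c3%83")

# Turning those octets into text needs out-of-band knowledge.
as_utf8 = raw.decode("utf-8")      # one character after the slash
as_latin1 = raw.decode("latin-1")  # two characters after the slash

print(raw, len(as_utf8), len(as_latin1))
```

Both decodes succeed, so a consumer that guesses gets no error to tell
it that the guess was wrong.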

Beyond that, producers can do odd things - like when there are a
series of servers stacked and forwarding requests amongst themselves -
where they generate different parts of the same URL using different
encodings.

>    2.  Characters
>
>    The URI syntax provides a method of encoding data, presumably for
>    the sake of identifying a resource, as a sequence of characters.
>    The URI characters are, in turn, frequently encoded as octets for
>    transport or presentation.  This specification does not mandate any
>    particular character encoding for mapping between URI characters
>    and the octets used to store or transmit those characters.  When a
>    URI appears in a protocol element, the character encoding is
>    defined by that protocol; without such a definition, a URI is
>    assumed to be in the same character encoding as the surrounding
>    text.

That's true, but it's been taken out of context; the set of characters
permitted in a URL is a strict subset of the characters found in ASCII;
there is a BNF that defines it and it is quite precise. While it
doesn't define a set of octets, it also doesn't define support for
unicode characters - individual schemes need to define the mapping
used between characters defined as safe and those that get percent
encoded. E.g. unicode (abstract) -> utf8 -> percent encoded.

See also the section on comparing URLs - Unicode isn't at all relevant.

>  > if the standard library has made them unicode objects in 3, I
>  > expect a lot of pain in the webserver space.
>
> Yup.  But pain is inevitable if people are treating URIs (whether URLs
> or otherwise) as octet sequences.  Then your base URL is gonna be
> b'mailto:stephen at xemacs.org', but the natural thing the UI will want
> to do is
>
>    formurl = baseurl + '?subject=??????????'
>
> IMO, the UI is right.  "Something" like the above "ought" to work.

I wish it would. The problem is not in Python here though - and
casual handwaving will exacerbate it, not fix it. Modelling URLs as
string-like things is great from a convenience perspective, but, like
file paths, they are much more complex.

For your particular case, subject contains characters outside the URL
specification, so someone needs to choose an encoding to get them into
a sequence-of-bytes-that-can-be-percent-escaped.
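
To illustrate why that choice matters, here is a small sketch with
urllib.parse.quote from Python 3. The subject text is a hypothetical
stand-in, and the two encodings are assumptions chosen only to show
that the same characters produce different escaped forms.

```python
from urllib.parse import quote

subject = "caf\u00e9"  # hypothetical non-ASCII subject text

# The same characters, escaped via two different encodings.
utf8_form = quote(subject, encoding="utf-8")
latin1_form = quote(subject, encoding="latin-1")

print(utf8_form)
print(latin1_form)
```

Neither form is wrong by the URL syntax; which one the receiving end
expects is exactly the out-of-band knowledge discussed above.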

Section 2.5, "Identifying Data", goes into this to some degree. Note a
trap - the last paragraph says 'when a *NEW* URI scheme...' (emphasis
mine). Existing schemes do not mandate UTF8, which is why the
producer/consumer split matters. I spent a few minutes looking, but
it's lost in the minutiae somewhere - HTTP does not specify UTF8
(though I wish it would) for its URIs, and std66 is the generic
definition and rules for new URI schemes, preserving intact the
mistake of HTTP.

> So the function that actually handles composing the URL should take a
> string (ie, unicode), and do all escaping.  The UI code should not
> need to know about escaping.  If nothing escapes except the function
> that puts the URL in composed form, and that function always escapes,
> life is easy.

Arg. The problem is very similar to the file system problem:
 - We get given a sequence of bytes
 - we have some rules that will let us manipulate the sequence to get
hostnames, query parameters and so forth
 - and others to let us walk a directory structure
 - and no guarantee that any of the data is in any particular encoding
other than 'URL'.

In terms of sequence datatypes then, we can consider a few:
 - bytes
 - unicode
 - list-of-numbers
 - ...

unicode is a problem because the system we're talking to is defined
to be a superset of unicode. People can shove stuff that fits into the
unused unicode plane, and it's OK by the URL standard (for all that it
would be ugly). Having a part-unicode, part-bytes representation would
be pretty ugly IMO; certainly decoding only part of the URL would be
prone to the sorts of issues Python 2 had with str/unicode.

lists of numbers are really awkward to manipulate.

bytes doesn't suffer the unicode problem, it can represent everything
we receive, but it doesn't offer any particular support for getting a
unicode string *when one is available*.

> Of course, in real life it's not that easy.  But it's possible to make
> things unnecessarily hard for the users of your URI API(s), and one
> way to do that is to make URIs into "just bytes" (and "just unicode"
> is probably nearly as bad, except that at least you know it's not
> ready for the wire).

If Unicode were relevant to HTTP, I'd agree, but it's not; we should put
fragile heuristics at the outer layer of the API and work as robustly
and mechanically as possible at the core. Where we need to guess, we
need worker functions that won't guess at all - for the sanity of folk
writing servers and protocol implementations.

-Rob

From janssen at parc.com  Mon Jun 21 21:13:05 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 12:13:05 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
Message-ID: <75635.1277147585@parc.com>

Benjamin Peterson <benjamin at python.org> wrote:

> 2010/6/21 Bill Janssen <janssen at parc.com>:
> > Considering that we've just released 2.7rc2, there are an awful lot of
> > red buildbots for 2.7.  In fact, I don't remember having seen a green
> > buildbot for OS X and 2.7.  Shouldn't these be fixed?
> 
> It seems most of them are off line and their last run was just a failure.

No, the three OS X buildbots are all online and reporting failures.  As
far as I can remember, they haven't been green for weeks.

They are at the end of the buildbot list, so off-screen if you are using
a normal browser.  You have to scroll to see them.

> > On OS X Leopard, I'm seeing failures in test_py3kwarn,
> > test_urllib2_localnet, test_uuid.
> >
> > On OS X Tiger, I'm seeing failures in test_pep277, test_py3kwarn,
> > test_ttk_guionly, and test_urllib2_localnet.

Um -- saying what, the buildbots are red?  Shouldn't having green
buildbots be a part of the release process?  In fact, it is -- but none
of the OS X buildbots are part of the "stable" set.  Why is that?

Bill

From pje at telecommunity.com  Mon Jun 21 21:14:29 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 15:14:29 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
	<871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100621191432.710993A404D@sparrow.telecommunity.com>

At 03:08 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
>Barry Warsaw writes:
>
>  > Would it make sense to have "encoding-carrying" bytes and str
>  > types?
>
>
>I think the answer is "no", though, because (1) it would constitute an
>attractive nuisance (the default would be abused, it would work fine
>in Kansas, and all hell would break loose in Kagoshima, simply
>delaying the pain and/or passing it on to third parties),

You have the proposal exactly backwards, actually.

In Kagoshima, you'd pass in an ebytes with your encoding to a 
stdlib API, and *get back an ebytes with the right encoding*, rather 
than an (incorrect and useless) unicode object which has lost data you need.


>Why limit that to bytes and str?  Why not have all objects carry their
>serializer/deserializer around with them?

Because it's not a serialization or deserialization.  Your conceptual 
framework here implies that unicode objects are the real thing, and 
that bytes are "just" a way of transporting unicode around.

But this is not the case at all, for use cases where "no, really, you 
*have to* work with bytes-encoded text streams".  The mere release of 
Python 3.x will not cause all the world's applications, libraries, 
and protocols to suddenly work with unicode, where they did not before.

Being explicit about the encoding of the bytes you're flinging around 
is actually an *increase* in specificity, explicitness, robustness, 
and error-checking ability over the status quo for either 2.x *or* 
3.x...  *and* it improves these qualities for essentially *all* 
string-handling code, without requiring that code to be rewritten to do so.

It's like getting to use the time machine, really.


>and (2) you
>really want this under control of higher level objects that have
>access to some knowledge of the environment, rather than the lowest
>level.

This proposal actually has such a higher-level object: an 
ebytes.  And it passes that information *through* the lowest level, 
in such a way as to permit the stringlike operations to be fully 
polymorphic, without the information being lost inside somebody else's API.
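
A rough sketch of what that could look like. The ebytes type exists
only as a proposal in this thread, so everything below is hypothetical:
the class body, the _as_bytes helper name, and the choice to raise
TypeError on mismatched encodings are assumptions made for
illustration, not an agreed design.

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that carries its encoding along."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def _as_bytes(self, other):
        # str operands are encoded into our encoding; this raises
        # UnicodeEncodeError when the text cannot be represented.
        if isinstance(other, str):
            return other.encode(self.encoding)
        if isinstance(other, ebytes) and other.encoding != self.encoding:
            raise TypeError(
                "cannot mix %s and %s" % (self.encoding, other.encoding))
        if isinstance(other, bytes):
            return bytes(other)
        raise TypeError("unsupported operand: %r" % type(other))

    def __add__(self, other):
        return ebytes(bytes(self) + self._as_bytes(other), self.encoding)

    def __radd__(self, other):
        return ebytes(self._as_bytes(other) + bytes(self), self.encoding)


item = ebytes("caf\u00e9".encode("latin-1"), "latin-1")
label = "menu: " + item           # str + ebytes -> ebytes
print(type(label).__name__, label.encoding, bytes(label))

try:
    item + "\u65e5\u672c"         # not representable in latin-1
except UnicodeEncodeError as exc:
    print("rejected at coercion time:", exc.reason)
```

The point of the sketch is that the encoding survives the operation and
that a bad combination fails at the moment of coercion rather than
later, at output time.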


From barry at python.org  Mon Jun 21 21:22:38 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 15:22:38 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <73196.1277143019@parc.com>
References: <73196.1277143019@parc.com>
Message-ID: <71F08437-B687-4AC2-8FC2-856BE0DE50FA@python.org>

On Jun 21, 2010, at 1:56 PM, Bill Janssen wrote:

> Considering that we've just released 2.7rc2, there are an awful lot of
> red buildbots for 2.7.  In fact, I don't remember having seen a green
> buildbot for OS X and 2.7.  Shouldn't these be fixed?
> 
> On OS X Leopard, I'm seeing failures in test_py3kwarn,
> test_urllib2_localnet, test_uuid.
> 
> On OS X Tiger, I'm seeing failures in test_pep277, test_py3kwarn,
> test_ttk_guionly, and test_urllib2_localnet.
> 
> We don't have a buildbot running Snow Leopard, apparently.

On my OS X 10.6.4 box, only test_py3kwarn and test_urllib2_localnet fail.

-Barry


From solipsis at pitrou.net  Mon Jun 21 21:29:04 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 21 Jun 2010 21:29:04 +0200
Subject: [Python-Dev] red buildbots on 2.7
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com>
Message-ID: <20100621212904.7bec83f6@pitrou.net>

On Mon, 21 Jun 2010 12:13:05 PDT
Bill Janssen <janssen at parc.com> wrote:
> 
> > > On OS X Leopard, I'm seeing failures in test_py3kwarn,
> > > test_urllib2_localnet, test_uuid.
> > >
> > > On OS X Tiger, I'm seeing failures in test_pep277, test_py3kwarn,
> > > test_ttk_guionly, and test_urllib2_localnet.
> 
> Um -- saying what, the buildbots are red?  Shouldn't having green
> buildbots be a part of the release process?  In fact, it is -- but none
> of the OS X buildbots are part of the "stable" set.  Why is that?

Benjamin is not qualified to fix OS X bugs AFAIK (if you are, Benjamin,
then sorry for misrepresenting you :-)). Actually, neither are most of
us.

Apparently some of these buildbots belong to you. Why don't you step
up and investigate?

Thanks,

Antoine.


From a.badger at gmail.com  Mon Jun 21 21:29:52 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 15:29:52 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621172413.578853A404D@sparrow.telecommunity.com>
References: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
Message-ID: <20100621192952.GZ5787@unaka.lan>

On Mon, Jun 21, 2010 at 01:24:10PM -0400, P.J. Eby wrote:
> At 12:34 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> >What do you think of making the encoding attribute a mandatory part of
> >creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
> 
> As long as the coercion rules force str+ebytes (or str % ebytes,
> ebytes % str, etc.) to result in another ebytes (and fail if the str
> can't be encoded in the ebytes' encoding), I'm personally fine with
> it, although I really like the idea of tacking the encoding to bytes
> objects in the first place.
> 
I wouldn't like this.  It brings us back to the python2 problem where
sometimes you pass an ebyte into a function and it works and other times you
pass an ebyte into the function and it issues a traceback.  The coercion
must end up with a str and no traceback (this assumes that we've checked
that the ebyte and the encoding "match" when we create the ebyte).

If you want bytes out the other end, you should either have a different
function or explicitly transform the output from str to bytes.

So, what's the advantage of using ebytes instead of bytes?

* It keeps together the text and encoding information when you're taking
  bytes in and want to give bytes back under the same encoding.
* It takes some of the boilerplate that people are supposed to do (checking
  that bytes are legal in a specific encoding) and writes it into the
  initialization of the object.  That forces you to think about the issue
  at two points in the code:  when converting into ebytes and when
  converting out to bytes.  For data that's going to be used with both
  str and bytes, this is the accepted best practice.  (For exceptions, the
  byte type remains which you can do conversion on when you want to).
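
A minimal sketch of the second bullet, assuming a hypothetical ebytes
class whose constructor requires an encoding and validates the bytes
against it. The to_str method name is an assumption made for
illustration; nothing like it has been specified in this thread.

```python
class ebytes(bytes):
    """Hypothetical: the encoding is mandatory and checked at creation."""

    def __new__(cls, data, encoding):
        # Decoding once up front raises UnicodeDecodeError if the bytes
        # are not legal in the claimed encoding.
        data.decode(encoding)
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def to_str(self):
        # Cannot fail here: the bytes were validated in __new__.
        return self.decode(self.encoding)


eb = ebytes("\u65e5\u672c\u8a9e".encode("euc-jp"), "euc-jp")
text = eb.to_str()

try:
    ebytes(b"\xff\xfe", "utf-8")   # not legal UTF-8
    rejected = False
except UnicodeDecodeError:
    rejected = True

print(text == "\u65e5\u672c\u8a9e", rejected)
```

With the check done at construction, any traceback happens where the
bytes enter the program, not inside whatever function later receives
them.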

-Toshio

From benjamin at python.org  Mon Jun 21 21:30:15 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Mon, 21 Jun 2010 14:30:15 -0500
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <75635.1277147585@parc.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com>
Message-ID: <AANLkTinKcYtbrt9V5i18cT5mBD5A-rC6QWyY10TQ7cbi@mail.gmail.com>

2010/6/21 Bill Janssen <janssen at parc.com>:
> They are at the end of the buildbot list, so off-screen if you are using
> a normal browser.  You have to scroll to see them.

But not on the "stable" view and that's the only one I look at.


-- 
Regards,
Benjamin

From barry at python.org  Mon Jun 21 21:39:59 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 15:39:59 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <201006211113.06767.stephan.richter@gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>
	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>
	<201006211113.06767.stephan.richter@gmail.com>
Message-ID: <20100621153959.01fee007@heresy>

On Jun 21, 2010, at 11:13 AM, Stephan Richter wrote:

>I really just want to be able to go to PyPI, Click on "Browse packages" and 
>then select "Python 3" (it can currently be accomplished by clicking "Python" 
>and then  "3"). Of course, package developers need to be encouraged to add 
>these Trove classifiers so that the listings are as complete as possible.

Trove classifiers are not particularly user friendly.  I wonder if we can help
with a (partially) automated or guided tool to help?  Maybe something on the
web page for packages w/o classifications, kind of like a Linked-in progress
meter...

-Barry

From fuzzyman at voidspace.org.uk  Mon Jun 21 21:45:08 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 21 Jun 2010 20:45:08 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTinKcYtbrt9V5i18cT5mBD5A-rC6QWyY10TQ7cbi@mail.gmail.com>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>
	<AANLkTinKcYtbrt9V5i18cT5mBD5A-rC6QWyY10TQ7cbi@mail.gmail.com>
Message-ID: <4C1FC144.70600@voidspace.org.uk>

On 21/06/2010 20:30, Benjamin Peterson wrote:
> 2010/6/21 Bill Janssen<janssen at parc.com>:
>    
>> They are at the end of the buildbot list, so off-screen if you are using
>> a normal browser.  You have to scroll to see them.
>>      
> But not on the "stable" view and that's the only one I look at.
>
>    

What are the requirements for moving the OS X buildbots into the stable 
view? Are the builders themselves stable enough? (If the requirement is 
that the buildbots be green then it is something of a catch-22.)

All the best,

Michael

-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From barry at python.org  Mon Jun 21 21:55:50 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 15:55:50 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621163404.GV5787@unaka.lan>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
Message-ID: <20100621155550.643d27b8@heresy>

On Jun 21, 2010, at 12:34 PM, Toshio Kuratomi wrote:

>I like the idea of having encoding information carried with the data.
>I don't think that an ebytes type that can *optionally* have an encoding
>attribute makes the situation less confusing, though.

Agreed.  I think the attribute should always be there, but there probably
needs to be a magic value (perhaps None) that indicates an unknown, manual,
garbage, error, broken encoding.

Examples: you read bytes off a socket and don't know what the encoding is; you
concatenate two ebytes that have incompatible encodings.

>To me the biggest
>problem with python-2.x's unicode/bytes handling was not that it threw
>exceptions but that it didn't always throw exceptions.  You might test this
>in python2::
>    t = u'cafe'
>    function(t)
>
>And say, ah my code works.  Then a user gives it this::
>    t = u'café'
>    function(t)
>
>And get a unicode error because the function only works with unicode in the
>ascii range.

That's an excellent point.

>ebytes seems to have the same pitfall where the code path exercised by your
>tests could work with::
>    eb = ebytes(b)
>    eb.encoding = 'euc-jp'
>    function(eb)
>
>but the user exercises a code path that does this and fails::
>    eb = ebytes(b)
>    function(eb)
>
>What do you think of making the encoding attribute a mandatory part of
>creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).

If ebytes is a separate type, then definitely +1.  If 'ebytes is bytes' then
I'd probably want to default the second argument to the magical "i-don't-know"
marker.
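
Roughly what I have in mind, as a sketch only. The class is
hypothetical, and using None as the marker and degrading to it on
mismatched concatenation are assumptions taken from the examples above,
not a worked-out design.

```python
UNKNOWN = None  # the magical "i-don't-know" marker


class ebytes(bytes):
    """Hypothetical: the attribute is always there, possibly UNKNOWN."""

    def __new__(cls, data, encoding=UNKNOWN):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __add__(self, other):
        # Plain bytes have no encoding attribute, so treat them as UNKNOWN.
        other_encoding = getattr(other, "encoding", UNKNOWN)
        # Keep the encoding only when both sides agree on it.
        merged = self.encoding if other_encoding == self.encoding else UNKNOWN
        return ebytes(bytes(self) + bytes(other), merged)


from_socket = ebytes(b"raw data")            # encoding not known
same = ebytes(b"a", "utf-8") + ebytes(b"b", "utf-8")
mixed = ebytes(b"a", "utf-8") + ebytes(b"b", "euc-jp")

print(from_socket.encoding, same.encoding, mixed.encoding)
```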

-Barry

From steve at holdenweb.com  Mon Jun 21 21:55:55 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 22 Jun 2010 04:55:55 +0900
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilGK08cIOB2uDe4Go92CTEO0j1eQ4TpwVltC61t@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>	<201006211113.06767.stephan.richter@gmail.com>	<AANLkTiklu4xr_f3A8JL2mmTIdAmc9RBynFBWkuptavAd@mail.gmail.com>
	<AANLkTilGK08cIOB2uDe4Go92CTEO0j1eQ4TpwVltC61t@mail.gmail.com>
Message-ID: <4C1FC3CB.5080604@holdenweb.com>

Laurens Van Houtven wrote:
> On Mon, Jun 21, 2010 at 5:28 PM, Toshio Kuratomi <a.badger at gmail.com> wrote:
>> <nod> Fedora 14 is about the same.  A nice to have thing that goes along
>> with these would be a table that has packages ported to python3 and which
>> distributions have the python3 version of the package.
> 
> Yeah, this is exactly why I'd prefer to not have to maintain a
> specific list. Big distros are making Python 3.x available, it's not
> the default interpreter yet anywhere (AFAIK?), but that's going to
> happen in the next few releases of said distributions.
> 
> On Mon, Jun 21, 2010 at 5:31 PM, Arc Riley <arcriley at gmail.com> wrote:
>> Personally, I'd like to celebrate the upcoming Python 3.2 release (which
>> will hopefully include 3to2) with moving all packages which do not have the
>> 'Programming Language :: Python :: 3' classifier to a "Legacy" section of
>> PyPI and offer only Python 3 packages otherwise.  Of course put a banner at
>> the top clearly explaining that Python 2 packages can be found in the Legacy
>> section.
>>
>> Radical, I know, but at some point we really need to make this move.
> 
> I agree we have to make it at some point but I feel this is way, way too early.
> 
> thanks for your continued input,
> Laurens

But it's never too early to plan for something you know to be
inevitable. More planning might have helped earlier on. I don't think
it's likely to hurt now.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From janssen at parc.com  Mon Jun 21 21:57:22 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 12:57:22 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <20100621212904.7bec83f6@pitrou.net>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
Message-ID: <77297.1277150242@parc.com>

Antoine Pitrou <solipsis at pitrou.net> wrote:

> Benjamin is not qualified to fix OS X bugs AFAIK (if you are, Benjamin,
> then sorry for misrepresenting you :-)). Actually, neither are most of
> us.

Right.  I was thinking that the release manager should however be
responsible for not releasing while there are red buildbots.  But it's
not his fault, either; there are no OS X buildbots on the "stable" list,
and that's the list PEP 101 says to look at.

The real problem here is that a major platform doesn't have a "stable"
buildbot, I think.  I've logged an issue to that effect.

> Apparently some of these buildbots belong to you. Why don't you step
> up and investigate?

The fact that I'm running some buildbots doesn't mean I have to fix the
problems that they reveal, I think.

I did look at the py3kwarn failure, and couldn't figure out the various
twisty passages of deprecation warning as further snarled by the test
package.  I think that one needs someone who's intimately familiar with
the testing framework.

Bill

From janssen at parc.com  Mon Jun 21 21:57:53 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 12:57:53 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTinKcYtbrt9V5i18cT5mBD5A-rC6QWyY10TQ7cbi@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com>
	<AANLkTinKcYtbrt9V5i18cT5mBD5A-rC6QWyY10TQ7cbi@mail.gmail.com>
Message-ID: <77310.1277150273@parc.com>

Benjamin Peterson <benjamin at python.org> wrote:

> 2010/6/21 Bill Janssen <janssen at parc.com>:
> > They are at the end of the buildbot list, so off-screen if you are using
> > a normal browser.  You have to scroll to see them.
> 
> But not on the "stable" view and that's the only one I look at.

Right, and properly so.

Bill

From barry at python.org  Mon Jun 21 22:01:05 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 16:01:05 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
	<871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100621160105.25ae602f@heresy>

On Jun 22, 2010, at 03:08 AM, Stephen J. Turnbull wrote:

>Barry Warsaw writes:
>
> > Would it make sense to have "encoding-carrying" bytes and str
> > types?
>
>Why limit that to bytes and str?  Why not have all objects carry their
>serializer/deserializer around with them?

Only because the .encoding attribute isn't really a serializer/deserializer.
That's still bytes() and str() or the equivalent.  This is just a hint to a
specific serializer for parameters to that action.

>I think the answer is "no", though, because (1) it would constitute an
>attractive nuisance (the default would be abused, it would work fine
>in Kansas, and all hell would break loose in Kagoshima, simply
>delaying the pain and/or passing it on to third parties), and (2) you
>really want this under control of higher level objects that have
>access to some knowledge of the environment, rather than the lowest
>level.

I'm still not sure ebytes solves the problem, but it avoids one I'm most
concerned about seeing proposed.  I really really do not want to add
encoding=blah arguments to boatloads of function signatures.

-Barry

From solipsis at pitrou.net  Mon Jun 21 22:02:50 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 21 Jun 2010 22:02:50 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <77297.1277150242@parc.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
Message-ID: <1277150570.3369.1.camel@localhost.localdomain>

On Monday 21 June 2010 at 12:57 -0700, Bill Janssen wrote:
>
> > Apparently some of these buildbots belong to you. Why don't you step
> > up and investigate?
> 
> The fact that I'm running some buildbots doesn't mean I have to fix the
> problems that they reveal, I think.

You certainly don't have to. But please don't ask others to do it for
you, *especially* if the failure can't be reproduced under anything else
than OS X, and if no useful diagnosis is available.

Regards

Antoine.


From barry at python.org  Mon Jun 21 22:04:20 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 16:04:20 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621172413.578853A404D@sparrow.telecommunity.com>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
Message-ID: <20100621160420.63037f1c@heresy>

On Jun 21, 2010, at 01:24 PM, P.J. Eby wrote:

>OTOH, one potential problem with having the encoding on the bytes object
>rather than the ebytes object is that then you can't easily take bytes from a
>socket and then say what encoding they are, without interfering with the
>sockets API (or whatever other place you get the bytes from).

Unless the default was the "I don't know" marker and you were able to set it
after you've done whatever kind of application-level calculation you needed to
do.

-Barry

From steve at holdenweb.com  Mon Jun 21 21:59:53 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 22 Jun 2010 04:59:53 +0900
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvo3ln$55n$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>	<hvm6cu$gaq$1@dough.gmane.org>	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>
	<hvo3ln$55n$1@dough.gmane.org>
Message-ID: <hvogc0$tpt$2@dough.gmane.org>

Terry Reedy wrote:
> On 6/21/2010 8:33 AM, Nick Coghlan wrote:
> 
>> P.S. (We're going to have a tough decision to make somewhere along the
>> line where docs.python.org is concerned, too - when do we flick the
>> switch and make a 3.x version of the docs the default?
> 
> Easy. When 3.2 is released. When 2.7 is released, 3.2 becomes 'trunk'.
> Trunk releases always take over docs.python.org. To do otherwise would
> be to say that 3.2 is not a real trunk release and not yet ready for
> real use -- a major slam.
> 
> Actually, I thought this was already discussed and decided ;-).
> 
This also gives the 2.7 release its day in the sun before relegation to
maintenance status.

The Python 3 documents, when they become the default, should contain an
every-page link to the Python 2 documentation (though linkages may be a
problem - they could probably be done at a gross level).

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From srichter at cosmos.phy.tufts.edu  Mon Jun 21 21:44:35 2010
From: srichter at cosmos.phy.tufts.edu (Stephan Richter)
Date: Mon, 21 Jun 2010 15:44:35 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <20100621153959.01fee007@heresy>
References: <20100618050712.GC20639@thorne.id.au>
	<201006211113.06767.stephan.richter@gmail.com>
	<20100621153959.01fee007@heresy>
Message-ID: <201006211544.35518.srichter@cosmos.phy.tufts.edu>

On Monday, June 21, 2010, Barry Warsaw wrote:
>   On Jun 21, 2010, at 11:13 AM, Stephan Richter wrote:
> >I really just want to be able to go to PyPI, Click on "Browse packages"
> >and  then select "Python 3" (it can currently be accomplished by clicking
> >"Python" and then  "3"). Of course, package developers need to be
> >encouraged to add these Trove classifiers so that the listings are as
> >complete as possible.
> 
> Trove classifiers are not particularly user friendly.  I wonder if we can
> help with a (partially) automated or guided tool to help?  Maybe something
> on the web page for packages w/o classifications, kind of like a Linked-in
> progress meter...

Yeah that would be good. I thought the "Score" was something like that, but it 
is not transparent enough. It would be great, if PyPI would tell me how I can 
improve my package meta-data. (The Linked-in progress meter worked for me too. 
;-)

Regards,
Stephan
-- 
Entrepreneur and Software Geek
Google me. "Zope Stephan Richter"

From pengyu.ut at gmail.com  Mon Jun 21 22:07:53 2010
From: pengyu.ut at gmail.com (Peng Yu)
Date: Mon, 21 Jun 2010 15:07:53 -0500
Subject: [Python-Dev] Adding additional level of bookmarks and section
	numbers in python pdf documents.
Message-ID: <AANLkTikMD8ZXuA50e-QA7QRmMQg0qv6YNc7M-AKmfDp6@mail.gmail.com>

Hi,

The current PDF versions of the Python documents don't have bookmarks for
subsubsections. For example, there is no bookmark for the following
section in python_2.6.5_reference.pdf. Also, the bookmarks don't have
section numbers in them; I suggest including them.
Could these features be added in a future release of the Python documentation?

3.4.1 Basic customization

-- 
Regards,
Peng

From barry at python.org  Mon Jun 21 22:09:04 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 16:09:04 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621192952.GZ5787@unaka.lan>
References: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621192952.GZ5787@unaka.lan>
Message-ID: <20100621160904.166ad082@heresy>

On Jun 21, 2010, at 03:29 PM, Toshio Kuratomi wrote:

>I wouldn't like this.  It brings us back to the python2 problem where
>sometimes you pass an ebyte into a function and it works and other times you
>pass an ebyte into the function and it issues a traceback.  The coercion
>must end up with a str and no traceback (this assumes that we've checked
>that the ebyte and the encoding "match" when we create the ebyte).

Doing this at ebyte construction time does have the nice benefit of getting
the exception early, and because the ebyte is immutable, you could cache the
results in an attribute on the ebyte.  Well, immutable if the .encoding is
also immutable.  If that can change, then you'd have to re-run the cached
decoding whenever the attribute were set, and there would be a penalty paid
each time this was done.

That, plus the socket use case, does argue for a separate ebytes type.
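
A rough sketch of that trade-off, for illustration only.  The class name
MutableEncodingBytes and its decode_count counter are made up here, and
nothing like this exists in the stdlib.  It assumes the cached decoding is
refreshed inside a property setter, so every change of .encoding pays for a
full decode:

```python
class MutableEncodingBytes:
    """Hypothetical wrapper showing the cost of a settable .encoding."""

    def __init__(self, data, encoding):
        self._data = bytes(data)
        self.decode_count = 0      # instrumentation for this example only
        self.encoding = encoding   # goes through the setter below

    @property
    def encoding(self):
        return self._encoding

    @encoding.setter
    def encoding(self, value):
        # Decode first: if this raises, the previous encoding and cached
        # string are left untouched.
        text = self._data.decode(value)
        self.decode_count += 1
        self._encoding = value
        self._string = text

    @property
    def string(self):
        # Cached result of the most recent successful decode.
        return self._string
```

With an immutable encoding the decode would happen exactly once, at
construction; here it happens once per assignment.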

-Barry

From pje at telecommunity.com  Mon Jun 21 22:09:52 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 16:09:52 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621192952.GZ5787@unaka.lan>
References: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621192952.GZ5787@unaka.lan>
Message-ID: <20100621201006.5A3223A404D@sparrow.telecommunity.com>

At 03:29 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
>On Mon, Jun 21, 2010 at 01:24:10PM -0400, P.J. Eby wrote:
> > At 12:34 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> > >What do you think of making the encoding attribute a mandatory part of
> > >creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
> >
> > As long as the coercion rules force str+ebytes (or str % ebytes,
> > ebytes % str, etc.) to result in another ebytes (and fail if the str
> > can't be encoded in the ebytes' encoding), I'm personally fine with
> > it, although I really like the idea of tacking the encoding to bytes
> > objects in the first place.
> >
>I wouldn't like this.  It brings us back to the python2 problem where
>sometimes you pass an ebyte into a function and it works and other times you
>pass an ebyte into the function and it issues a traceback.

For stdlib functions, this isn't going to happen unless your ebytes' 
encoding is not compatible with the ascii subset of unicode, or the 
stdlib function is working with dynamic data...  in which case you 
really *do* want to fail early!

I don't see this as a repeat of the 2.x situation; rather, it allows 
you to cause errors to happen much *earlier* than they would 
otherwise show up if you were using unicode for your encoded-bytes data.

For example, if your program's intent is to end up with latin-1 
output, then it would be better for an error to show up at the very 
*first* point where non-latin1 characters are mixed with your data, 
rather than only showing up at the output boundary!

However, if you promoted mixed-type operation results to unicode 
instead of ebytes, then you:

1) can't preserve data that doesn't have a 1:1 mapping to unicode, and

2) can't detect an error until your data reaches the output point in 
your application -- forcing you to defensively insert ebytes calls 
everywhere (vs. simply wrapping them around a handful of designated 
inputs), or else have to go right back to tracing down where the 
unusable data showed up in the first place.

One thing that seems like a bit of a blind spot for some folks is 
that having unicode is *not* everybody's goal.  Not because we don't 
believe unicode is generally a good thing or anything like that, but 
because we have to work with systems that flat out don't *do* 
unicode, thereby making the presence of (fully-general) unicode an 
error condition that has to be stamped out!

IOW, if you're producing output that has to go into another system 
that doesn't take unicode, it doesn't matter how 
theoretically-correct it would be for your app to process the data in 
unicode form.  In that case, unicode is not a feature: it's a bug.

And as it really *is* an error in that case, it should not pass 
silently, unless explicitly silenced.
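
To illustrate the coercion rule argued for above, here is a minimal sketch.
EBytes is a hypothetical stand-in (no such type exists in Python), and the
only behaviour it models is "str mixed with encoded bytes yields encoded
bytes, or fails on the spot":

```python
class EBytes:
    """Hypothetical encoded-bytes wrapper; illustrative only."""

    def __init__(self, data, encoding):
        data.decode(encoding)      # validate once, at construction time
        self.data = data
        self.encoding = encoding

    def _coerce(self, other):
        if isinstance(other, str):
            # Raises UnicodeEncodeError right here, at the mixing point,
            # if the text cannot be represented in our encoding.
            return other.encode(self.encoding)
        return NotImplemented

    def __add__(self, other):
        raw = self._coerce(other)
        if raw is NotImplemented:
            return NotImplemented
        return EBytes(self.data + raw, self.encoding)

    def __radd__(self, other):
        raw = self._coerce(other)
        if raw is NotImplemented:
            return NotImplemented
        return EBytes(raw + self.data, self.encoding)
```

With a latin-1 EBytes, adding 'le ' or ' noir' gives another latin-1 EBytes,
while adding a string containing a euro sign raises UnicodeEncodeError at
that line rather than at the output boundary.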


>So, what's the advantage of using ebytes instead of bytes?
>
>* It keeps together the text and encoding information when you're taking
>   bytes in and want to give bytes back under the same encoding.
>* It takes some of the boilerplate that people are supposed to do (checking
>   that bytes are legal in a specific encoding) and writes it into the
>   initialization of the object.  That forces you to think about the issue
>   at two points in the code:  when converting into ebytes and when
>   converting out to bytes.  For data that's going to be used with both
>   str and bytes, this is the accepted best practice.  (For exceptions, the
>   byte type remains which you can do conversion on when you want to).

Hm.  For the output case, I suppose that means you might also want 
the text I/O wrappers to be able to be strict about ebytes' encoding.


From fuzzyman at voidspace.org.uk  Mon Jun 21 22:13:26 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 21 Jun 2010 21:13:26 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <1277150570.3369.1.camel@localhost.localdomain>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>
	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
Message-ID: <4C1FC7E6.5070707@voidspace.org.uk>

On 21/06/2010 21:02, Antoine Pitrou wrote:
> On Monday 21 June 2010 at 12:57 -0700, Bill Janssen wrote:
>    
>>      
>>> Apparently some of these buildbots belong to you. Why don't you step
>>> up and investigate?
>>>        
>> The fact that I'm running some buildbots doesn't mean I have to fix the
>> problems that they reveal, I think.
>>      
> You certainly don't have to. But please don't ask others to do it for
> you, *especially* if the failure can't be reproduced under anything else
> than OS X, and if no useful diagnosis is available.
>    

If OS X is a supported and important platform for Python then fixing all 
problems that it reveals (or being willing to) should definitely not be 
a pre-requisite of providing a buildbot (which is already a service to 
the Python developer community). Fixing bugs / failures revealed by 
Bill's buildbot is not fixing them "for Bill" it is fixing them for Python.

All the best,

Michael

> Regards
>
> Antoine.
>
>
>    


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From pje at telecommunity.com  Mon Jun 21 22:16:13 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Mon, 21 Jun 2010 16:16:13 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621160420.63037f1c@heresy>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621160420.63037f1c@heresy>
Message-ID: <20100621201616.EADEF3A404D@sparrow.telecommunity.com>

At 04:04 PM 6/21/2010 -0400, Barry Warsaw wrote:
>On Jun 21, 2010, at 01:24 PM, P.J. Eby wrote:
>
> >OTOH, one potential problem with having the encoding on the bytes object
> >rather than the ebytes object is that then you can't easily take bytes from a
> >socket and then say what encoding they are, without interfering with the
> >sockets API (or whatever other place you get the bytes from).
>
>Unless the default was the "I don't know" marker and you were able to set it
>after you've done whatever kind of application-level calculation you needed to
>do.

True, but making it a separate type with a required encoding gets rid 
of the magical "I don't know" - the "I don't know" encoding is just a 
plain old bytes object.

(In principle, you could then drop *all* the stringlike methods from 
plain-old-bytes objects.  If it's really text-in-bytes you want, you 
should use an ebytes with the encoding specified.)


From barry at python.org  Mon Jun 21 22:19:17 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 16:19:17 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621171803.B35C33A414B@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
	<20100621171803.B35C33A414B@sparrow.telecommunity.com>
Message-ID: <20100621161917.13efe49a@heresy>

On Jun 21, 2010, at 01:17 PM, P.J. Eby wrote:

>I'm not really sure how much use the encoding is on a unicode object - what
>would it actually mean?
>
>Hm. I suppose it would effectively mean "this string can be represented in
>this encoding" -- which is useful, in that you could fail operations when
>combining with bytes of a different encoding.

That's basically what I was thinking.

>Hm... no, in that case you should just encode the string to the bytes'
>encoding, and let that throw an error if it fails.  So, really, there's no
>reason for a string to know its encoding.  All you need is the bytes type to
>have an encoding attribute, and when doing mixed-type operations between
>bytes and strings, coerce to *bytes of the same encoding*.

If ebytes were a separate type, and it did the encoding check at constructor
time, and the results of the decoding were cached, then I think you would not
need the equivalent of an estr type.  If you had a string and knew what it
could be encoded to, then you could just coerce it to an ebytes and use the
cached decoded value wherever you needed it.

E.g.

    >>> mystring = 'some unicode string'
    >>> myencoding = 'iso-9999-foo'
    >>> myebytes = ebytes(mystring, myencoding)
    >>> myebytes.encoding == myencoding
    True
    >>> myebytes.string == mystring
    True

So ebytes() could accept a str or bytes as its first argument.

    >>> mybytes = b'some encoded string'
    >>> myebytes = ebytes(mybytes, myencoding)
    >>> mybytes == myebytes
    True
    >>> myebytes.encoding == myencoding
    True

In the first example ebytes() encodes mystring to set the internal bytes
representation.  In the second example, ebytes() decodes the bytes to get the
.string attribute value.  In both cases, an exception is raised if the
encoding/decoding fails.
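
For what it's worth, the interpreter sessions above could be backed by
something like the following sketch.  It is only one possible shape: it
assumes ebytes is a bytes subclass, that .encoding and .string are plain
attributes, and it is exercised below with 'latin-1' because the
'iso-9999-foo' codec above is fictitious:

```python
class ebytes(bytes):
    """Hypothetical 'bytes with a known encoding'; not a real Python type."""

    def __new__(cls, data, encoding):
        if isinstance(data, str):
            # str input: encode it; raises UnicodeEncodeError on failure.
            raw, text = data.encode(encoding), data
        else:
            # bytes input: decode it; raises UnicodeDecodeError on failure.
            raw = bytes(data)
            text = raw.decode(encoding)
        self = super().__new__(cls, raw)
        self.encoding = encoding
        self.string = text         # cached decoded value
        return self

    def __str__(self):
        return self.string
```

Because it subclasses bytes, comparison with a plain bytes object of the
same content holds, as in the second session.  Note that plain attributes
do not enforce immutability; a real design would have to.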

>However, if .encoding is None, then coercion would follow the same rules as
>now -- i.e., convert the bytes to unicode, assuming an ascii encoding.  (This
>would be different than setting an encoding of 'ascii', because in that case,
>it means you want cross-type operations to result in ascii bytes, rather than
>a unicode string, and to fail if the unicode part can't be encoded
>appropriately.  The 'None' setting is effectively a nod to compatibility with
>prior 3.x versions, since I assume we can't just throw out the old coercion
>behavior.)
>
>Then, a few more changes to the bytes type would round out the implementation:
>
>* Allow .decode() to not specify an encoding, unless .encoding is None
>
>* Add back in the missing string methods (e.g. .encode()), since you can transparently upgrade to a string
>
>* Smart __str__, as shown in your proposal.

If my example above isn't nonsense, then __str__() would just return the
.string attribute.

>In short, +1.  (I wish it were possible to go back and make bytes non-strings
>and have only this ebytes or bstr or whatever type have string methods, but
>I'm pretty sure that ship has already sailed.)

Maybe it's PEP time?  No, I'm not volunteering. ;)

-Barry


From barry at python.org  Mon Jun 21 22:24:47 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 16:24:47 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621201616.EADEF3A404D@sparrow.telecommunity.com>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621160420.63037f1c@heresy>
	<20100621201616.EADEF3A404D@sparrow.telecommunity.com>
Message-ID: <20100621162447.697af8da@heresy>

On Jun 21, 2010, at 04:16 PM, P.J. Eby wrote:

>At 04:04 PM 6/21/2010 -0400, Barry Warsaw wrote:
>>On Jun 21, 2010, at 01:24 PM, P.J. Eby wrote:
>>
>> >OTOH, one potential problem with having the encoding on the bytes object
>> >rather than the ebytes object is that then you can't easily take bytes from a
>> >socket and then say what encoding they are, without interfering with the
>> >sockets API (or whatever other place you get the bytes from).
>>
>>Unless the default was the "I don't know" marker and you were able to set it
>>after you've done whatever kind of application-level calculation you needed to
>>do.
>
>True, but making it a separate type with a required encoding gets rid of the magical "I don't know" - the "I don't know" encoding is just a plain old bytes object.
>
>(In principle, you could then drop *all* the stringlike methods from plain-old-bytes objects.  If it's really text-in-bytes you want, you should use an ebytes with the encoding specified.)

Yep, agreed!
-Barry

From solipsis at pitrou.net  Mon Jun 21 22:25:26 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Mon, 21 Jun 2010 22:25:26 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FC7E6.5070707@voidspace.org.uk>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk>
Message-ID: <1277151926.3369.6.camel@localhost.localdomain>

On Monday 21 June 2010 at 21:13 +0100, Michael Foord wrote:
> 
> If OS X is a supported and important platform for Python then fixing all 
> problems that it reveals (or being willing to) should definitely not be 
> a pre-requisite of providing a buildbot (which is already a service to 
> the Python developer community). Fixing bugs / failures revealed by 
> Bill's buildbot is not fixing them "for Bill" it is fixing them for Python.

I didn't say it was a prerequisite. I was merely pointing out that when
platform-specific bugs appear, people using the specific platform should
be helping if they want to actually encourage the fixing of these bugs.

OS X is only "a supported and important platform" if we have dedicated
core developers diagnosing or even fixing issues for it (like we
obviously have for Windows and Linux). Otherwise, I don't think we have
any moral obligation to support it.

Regards

Antoine.


From a.badger at gmail.com  Mon Jun 21 22:28:39 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 16:28:39 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
References: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
Message-ID: <20100621202839.GA5787@unaka.lan>

On Mon, Jun 21, 2010 at 02:46:57PM -0400, P.J. Eby wrote:
> At 02:58 AM 6/22/2010 +0900, Stephen J. Turnbull wrote:
> >Nick alluded to the One Obvious Way as a change in architecture.
> >
> >Specifically: Decode all bytes to typed objects (str, images, audio,
> >structured objects) at input.  Do no manipulations on bytes ever
> >except decode and encode (both to text, and to special-purpose objects
> >such as images) in a program that does I/O.
> 
> This ignores the existence of use cases where what you have is text
> that can't be properly encoded in unicode.  I know, it's a hard thing
> to wrap one's head around, since on the surface it sounds like
> unicode is the programmer's savior.  Unfortunately, real-world text
> data exists which cannot be safely roundtripped to unicode, and must
> be handled in "bytes with encoding" form for certain operations.
> 
> I personally do not have to deal with this *particular* use case any
> more -- I haven't been at NTT/Verio for six years now.  But I do know
> it exists for e.g. Asian language email handling, which is where I
> first encountered it.  At the time (this *may* have changed), many
> popular email clients did not actually support unicode, so you
> couldn't necessarily just send off an email in UTF-8.  It drove us
> nuts on the project where this was involved (an i18n of an existing
> Python app), and I think we had to compromise a bit in some fashion
> (because we couldn't really avoid unicode roundtripping due to
> database issues), but the use case does actually exist.
> 
> My current needs are simpler, thank goodness.  ;-)  However, they
> *do* involve situations where I'm dealing with *other*
> encoding-restricted legacy systems, such as software for interfacing
> with the US Postal Service that only works with a restricted subset
> of latin1, while receiving mangled ASCII from an ecommerce provider,
> and storing things in what's effectively a latin-1 database.  Being
> able to easily assert what kind of bytes I've got would actually let
> me catch errors sooner, *if* those assertions were being checked when
> different kinds of strings or bytes were being combined (i.e., at
> coercion time).
> 
While it's certainly possible that you have a grapheme that has no
corresponding unicode codepoint, it doesn't sound like this is the case
you're dealing with here.  You talk about "restricted subset of latin1"
but all of latin1's graphemes have unicode codepoints.  You also talk about
not being able to "send off an email in UTF-8" but UTF-8 is an encoding of
unicode, not unicode itself.  Similarly, the statement that some email
clients don't support unicode isn't very clear as to the actual problem.  The
email client supports displaying graphemes using glyphs present on the
computer.  As long as the graphemes needed have a unicode codepoint, using
unicode inside of your application and then encoding to bytes on the way out
works fine.
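
To make the distinction concrete, here is a minimal Python 3 sketch (the
sample string is illustrative and not taken from any of the systems discussed
above).  It shows that every latin-1 byte has a unicode codepoint, and that
UTF-8 is just one of several encodings you can pick on the way out:

```python
# Every possible byte value decodes under latin-1 and round-trips unchanged.
all_bytes = bytes(range(256))
text = all_bytes.decode("latin-1")
assert text.encode("latin-1") == all_bytes

# Work with unicode inside the application ...
word = "caf\u00e9"  # 'cafe' with an e-acute; illustrative sample text

# ... and choose the byte encoding only at the output boundary.
as_latin1 = word.encode("latin-1")  # one byte for the e-acute
as_utf8 = word.encode("utf-8")      # two bytes for the same codepoint
```

The same str object produces two different byte strings; which one is right
depends only on what the receiving system expects.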

Even in cases where there's no unicode codepoint for the grapheme that
you're receiving, unicode gives you a way out.  It provides a private use
area where you can map the graphemes to unused codepoints.  Your
application keeps a mapping from each such codepoint to the particular byte
sequence that you want.  Then you write a codec that converts from unicode
with these private codepoints into your particular encoding (and from bytes
back into unicode).
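
A rough Python 3 sketch of the private-use-area approach follows.  Everything
in it is an assumption made for illustration: the two table entries, the rule
that a byte >= 0x80 starts a two-byte sequence, and the function names
encode_legacy and decode_legacy.  A real legacy encoding would need its own
table and framing rules:

```python
# Hypothetical table: private-use codepoints -> legacy two-byte sequences.
# Both the codepoints and the byte values are made up for illustration.
PUA_TO_BYTES = {
    "\ue000": b"\x81\x40",
    "\ue001": b"\x81\x41",
}
BYTES_TO_PUA = {seq: ch for ch, seq in PUA_TO_BYTES.items()}


def encode_legacy(text):
    """Encode str to bytes: ASCII passes through, PUA chars use the table."""
    out = bytearray()
    for ch in text:
        if ch in PUA_TO_BYTES:
            out += PUA_TO_BYTES[ch]
        else:
            out += ch.encode("ascii")  # raises on anything unmapped
    return bytes(out)


def decode_legacy(data):
    """Decode bytes to str: a byte >= 0x80 starts a two-byte table entry."""
    chars = []
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            chars.append(chr(data[i]))
            i += 1
        else:
            chars.append(BYTES_TO_PUA[bytes(data[i:i + 2])])
            i += 2
    return "".join(chars)
```

Inside the application the text is an ordinary str, so slicing, searching and
concatenation all work; only the two boundary functions know about the legacy
byte sequences.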

-Toshio

From mal at egenix.com  Mon Jun 21 22:29:13 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 21 Jun 2010 22:29:13 +0200
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621155550.643d27b8@heresy>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>	<20100621114307.48735698@heresy>
	<20100621163404.GV5787@unaka.lan> <20100621155550.643d27b8@heresy>
Message-ID: <4C1FCB99.1090102@egenix.com>

Barry Warsaw wrote:
> On Jun 21, 2010, at 12:34 PM, Toshio Kuratomi wrote:
> 
>> I like the idea of having encoding information carried with the data.
>> I don't think that an ebytes type that can *optionally* have an encoding
>> attribute makes the situation less confusing, though.
> 
> Agreed.  I think the attribute should always be there, but there probably
> needs to be a magic value (perhaps None) that indicates an unknown, manual,
> garbage, error, broken encoding.
> 
> Examples: you read bytes off a socket and don't know what the encoding is; you
> concatenate two ebytes that have incompatible encodings.

Such extra information tends to be lost whenever you pass the
bytes data through a C level API or some other function that
doesn't know about the special nature of those objects, treating
them just like any bytes object.

It may sound nice in theory, but in practice it doesn't work out.
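
A minimal Python 3 sketch of how the tag gets lost.  The ebytes class here is
hypothetical (no such type exists in the stdlib); it is just a bytes subclass
with an encoding attribute and a checked concatenation:

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that carries an encoding tag."""

    def __new__(cls, data, encoding=None):
        self = super().__new__(cls, data)
        self.encoding = encoding  # None stands for "unknown"
        return self

    def __add__(self, other):
        # Refuse to combine data tagged with different encodings.
        if getattr(other, "encoding", None) != self.encoding:
            raise TypeError("incompatible encodings")
        return ebytes(bytes(self) + bytes(other), self.encoding)


tagged = ebytes(b"caf\xe9", "latin-1")

# Checked concatenation keeps the tag ...
joined = tagged + ebytes(b"!", "latin-1")

# ... but any ordinary bytes method returns a plain bytes object,
# and the encoding information is silently gone.
upper = tagged.upper()
```

Here joined.encoding is still "latin-1", while upper is a plain bytes object
with no encoding attribute at all.  Every method and every C-level API would
have to be taught about the subclass to avoid that.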

Besides, if you do know the encoding, you can easily carry the
data around in a Unicode str object.

The problem lies elsewhere: What to do with a piece of text for
which you don't know the encoding and how to combine that piece
of text with other pieces of text for which you do know the
encoding.

There are a few options at hand:

 * you keep working on the bytes data and only convert things
   to Unicode when needed and where the encoding is known

 * you decode the bytes data for which you don't have the encoding
   information into some special Unicode form (eg. using the
   surrogateescape error handler) and hope that when the time
   comes to encode the Unicode data back into bytes, the codec
   supports reversing the conversion

 * you manage the data as a list of Unicode str and
   bytes objects and don't even try to be clever about encodings
   of text with unknown encoding

It depends a lot on the use case, which of these options fits
best.
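
As an illustration of the second option, a short Python 3 sketch of a
surrogateescape round trip.  The sample bytes are made up; the choice of
UTF-8 as the guessed encoding is also just an assumption for the example:

```python
raw = b"caf\xe9 au lait"  # latin-1 bytes, but pretend we don't know that

# Undecodable bytes become lone surrogates (U+DC80..U+DCFF) instead of errors.
text = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same codec and handler restores the original bytes exactly.
restored = text.encode("utf-8", errors="surrogateescape")

# Without the handler, the smuggled surrogate cannot be encoded.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```

The round trip only works if the data is encoded back with the same codec and
error handler; a strict encode of the same str fails, which is the "hope that
the codec supports reversing the conversion" caveat above.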

>> To me the biggest
>> problem with python-2.x's unicode/bytes handling was not that it threw
>> exceptions but that it didn't always throw exceptions.  You might test this
>> in python2::
>>    t = u'cafe'
>>    function(t)
>>
>> And say, ah my code works.  Then a user gives it this::
>>    t = u'café'
>>    function(t)
>>
>> And get a unicode error because the function only works with unicode in the
>> ascii range.
> 
> That's an excellent point.

Here's a little known fact: by changing the Python2 default
encoding to 'undefined' (yes, that's a real codec !), you can disable
all automatic string coercion in Python2.
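
For illustration, a small sketch of what the 'undefined' codec does.  It is
written for Python 3, where the codec still ships in the encodings package
and can be called explicitly.  Making it the Python 2 default encoding (via
sys.setdefaultencoding in a site customization file) is only mentioned in a
comment, not executed.  The helper name try_undefined is made up:

```python
import codecs

# The codec exists and can be looked up like any other.
info = codecs.lookup("undefined")


def try_undefined(value):
    """Return the error message raised when 'undefined' is used, else None."""
    try:
        if isinstance(value, bytes):
            value.decode("undefined")
        else:
            value.encode("undefined")
    except UnicodeError as exc:
        return str(exc)
    return None


# In Python 2, with 'undefined' as the default encoding, every implicit
# str <-> unicode coercion would hit this same error instead of silently
# falling back to ASCII.
encode_error = try_undefined("abc")
decode_error = try_undefined(b"abc")
```

Both calls fail even for pure-ASCII input, which is the point: the codec
refuses every conversion rather than only the non-ASCII ones.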

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 21 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                27 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From janssen at parc.com  Mon Jun 21 22:36:20 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 13:36:20 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <1277150570.3369.1.camel@localhost.localdomain>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
Message-ID: <94209.1277152580@parc.com>

Antoine Pitrou <solipsis at pitrou.net> wrote:

> On Monday, 21 June 2010 at 12:57 -0700, Bill Janssen wrote:
> >
> > > Apparently some of these buildbots belong to you. Why don't you step
> > > up and investigate?
> > 
> > The fact that I'm running some buildbots doesn't mean I have to fix the
> > problems that they reveal, I think.
> 
> You certainly don't have to. But please don't ask others to do it for
> you, *especially* if the failure can't be reproduced under anything else
> than OS X, and if no useful diagnosis is available.

I'm more concerned about doing it for *us*, rather than for *me*.  Yes,
an OS X machine would be required to poke at it, but I doubt I'm the
only one here with an OS X machine :-).  If I am, that's a problem, and
we as a community should do something about that.

I downloaded 2.7rc2 and built it on my Intel OS X 10.5.8 machine.  It
still fails the test_uuid test:

% make test
[...]
test_uuid
test test_uuid failed -- Traceback (most recent call last):
  File "/private/tmp/Python-2.7rc2/Lib/test/test_uuid.py", line 472, in testIssue8621
    self.assertNotEqual(parent_value, child_value)
AssertionError: '8395a08e40454895be537a180539b7fb' == '8395a08e40454895be537a180539b7fb'

[...]

However, when I run it directly:

% ./python.exe -Wd -3 -E -tt ./Lib/test/regrtest.py -v test_uuid
== CPython 2.7rc2 (r27rc2:82137, Jun 21 2010, 12:50:22) [GCC 4.0.1 (Apple Inc. build 5493)]
==   Darwin-9.8.0-i386-32bit little-endian
==   /private/tmp/Python-2.7rc2/build/test_python_58012
test_uuid
testIssue8621 (test.test_uuid.TestUUID) ... ok
test_UUID (test.test_uuid.TestUUID) ... ok
test_exceptions (test.test_uuid.TestUUID) ... ok
test_getnode (test.test_uuid.TestUUID) ... ok
test_ifconfig_getnode (test.test_uuid.TestUUID) ... ok
test_ipconfig_getnode (test.test_uuid.TestUUID) ... ok
test_netbios_getnode (test.test_uuid.TestUUID) ... ok
test_random_getnode (test.test_uuid.TestUUID) ... ok
test_unixdll_getnode (test.test_uuid.TestUUID) ... ok
test_uuid1 (test.test_uuid.TestUUID) ... ok
test_uuid3 (test.test_uuid.TestUUID) ... ok
test_uuid4 (test.test_uuid.TestUUID) ... ok
test_uuid5 (test.test_uuid.TestUUID) ... ok
test_windll_getnode (test.test_uuid.TestUUID) ... ok

----------------------------------------------------------------------
Ran 14 tests in 0.087s

OK
1 test OK.
%

So I don't know what to think.
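
Going by the test name and the assertion in the traceback, the check appears
to compare uuid4 values generated in a parent process and in a forked child.
A rough Python 3 sketch of that kind of check (a reconstruction for
illustration, not the actual code in test_uuid.py; the function name
fork_uuid_pair is made up, and it only runs on POSIX systems):

```python
import os
import uuid


def fork_uuid_pair():
    """Return (parent_hex, child_hex) from uuid4() on both sides of a fork."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # POSIX only
    if pid == 0:
        # Child: send our uuid4 back through the pipe, then exit immediately.
        try:
            os.close(read_fd)
            os.write(write_fd, uuid.uuid4().hex.encode("ascii"))
        finally:
            os._exit(0)
    # Parent: generate our own value, then collect the child's.
    os.close(write_fd)
    parent_hex = uuid.uuid4().hex
    os.waitpid(pid, 0)
    child_hex = os.read(read_fd, 100).decode("ascii")
    os.close(read_fd)
    return parent_hex, child_hex


parent_value, child_value = fork_uuid_pair()
```

If the two values come out equal, as in the failing "make test" run above,
the random source was presumably carried over unchanged into the child.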

The same thing happens with the py3kwarn test:

% ./python.exe -Wd -3 -E -tt ./Lib/test/regrtest.py -v test_py3kwarn
== CPython 2.7rc2 (r27rc2:82137, Jun 21 2010, 12:50:22) [GCC 4.0.1 (Apple Inc. build 5493)]
==   Darwin-9.8.0-i386-32bit little-endian
==   /private/tmp/Python-2.7rc2/build/test_python_58057
test_py3kwarn
test_backquote (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_buffer (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_builtin_function_or_method_comparisons (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_cell_inequality_comparisons (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_code_inequality_comparisons (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_dict_inequality_comparisons (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_file_xreadlines (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_forbidden_names (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_frame_attributes (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_hash_inheritance (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_methods_members (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_object_inequality_comparisons (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_operator (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_paren_arg_names (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_slice_methods (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_softspace (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_sort_cmp_arg (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_sys_exc_clear (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_tuple_parameter_unpacking (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_type_inequality_comparisons (test.test_py3kwarn.TestPy3KWarnings) ... ok
test_mutablestring_removal (test.test_py3kwarn.TestStdlibRemovals) ... ok
test_optional_module_removals (test.test_py3kwarn.TestStdlibRemovals) ... ok
test_os_path_walk (test.test_py3kwarn.TestStdlibRemovals) ... ok
test_platform_independent_removals (test.test_py3kwarn.TestStdlibRemovals) ... ok
test_platform_specific_removals (test.test_py3kwarn.TestStdlibRemovals) ... /private/tmp/Python-2.7rc2/Lib/plat-mac/findertools.py:303: SyntaxWarning: tuple parameter unpacking has been removed in 3.x
  def _setlocation(object_alias, (x, y)):
/private/tmp/Python-2.7rc2/Lib/plat-mac/findertools.py:445: SyntaxWarning: tuple parameter unpacking has been removed in 3.x
  def _setwindowsize(folder_alias, (w, h)):
/private/tmp/Python-2.7rc2/Lib/plat-mac/findertools.py:496: SyntaxWarning: tuple parameter unpacking has been removed in 3.x
  def _setwindowposition(folder_alias, (x, y)):
ok
test_reduce_move (test.test_py3kwarn.TestStdlibRemovals) ... ok

----------------------------------------------------------------------
Ran 26 tests in 0.343s

OK
1 test OK.
%

The only failing test remaining, when run as a singleton, is test_urllib2_localnet:

% ./python.exe -Wd -3 -E -tt ./Lib/test/regrtest.py -v test_urllib2_localnet
== CPython 2.7rc2 (r27rc2:82137, Jun 21 2010, 12:50:22) [GCC 4.0.1 (Apple Inc. build 5493)]
==   Darwin-9.8.0-i386-32bit little-endian
==   /private/tmp/Python-2.7rc2/build/test_python_58063
test_urllib2_localnet
test_proxy_qop_auth_int_works_or_throws_urlerror (test.test_urllib2_localnet.ProxyAuthTests) ... ok
test_proxy_qop_auth_works (test.test_urllib2_localnet.ProxyAuthTests) ... ok
test_proxy_with_bad_password_raises_httperror (test.test_urllib2_localnet.ProxyAuthTests) ... FAIL
test_proxy_with_no_password_raises_httperror (test.test_urllib2_localnet.ProxyAuthTests) ... FAIL
test_200 (test.test_urllib2_localnet.TestUrlopen) ... ok
test_200_with_parameters (test.test_urllib2_localnet.TestUrlopen) ... ok
test_404 (test.test_urllib2_localnet.TestUrlopen) ... ok
test_bad_address (test.test_urllib2_localnet.TestUrlopen) ... ok
test_basic (test.test_urllib2_localnet.TestUrlopen) ... ok
test_geturl (test.test_urllib2_localnet.TestUrlopen) ... ok
test_info (test.test_urllib2_localnet.TestUrlopen) ... ok
test_redirection (test.test_urllib2_localnet.TestUrlopen) ... ok
test_sending_headers (test.test_urllib2_localnet.TestUrlopen) ... ok

======================================================================
FAIL: test_proxy_with_bad_password_raises_httperror (test.test_urllib2_localnet.ProxyAuthTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/private/tmp/Python-2.7rc2/Lib/test/test_urllib2_localnet.py", line 264, in test_proxy_with_bad_password_raises_httperror
    self.URL)
AssertionError: HTTPError not raised

======================================================================
FAIL: test_proxy_with_no_password_raises_httperror (test.test_urllib2_localnet.ProxyAuthTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/private/tmp/Python-2.7rc2/Lib/test/test_urllib2_localnet.py", line 270, in test_proxy_with_no_password_raises_httperror
    self.URL)
AssertionError: HTTPError not raised

----------------------------------------------------------------------
Ran 13 tests in 9.050s

FAILED (failures=2)
test test_urllib2_localnet failed -- multiple errors occurred
1 test failed:
    test_urllib2_localnet
%

Bill

From regebro at gmail.com  Mon Jun 21 22:55:41 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 21 Jun 2010 22:55:41 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvjlpt$8pe$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com> 
	<hvjlpt$8pe$1@dough.gmane.org>
Message-ID: <AANLkTikfpz9VYG6tx4bJIxLRHQuaWD4ijTCU5PjFjGle@mail.gmail.com>

On Sun, Jun 20, 2010 at 02:02, Terry Reedy <tjreedy at udel.edu> wrote:
> After reading the discussion in the previous thread, I signed in to #python
> and verified that the intro message starts with a lie about python3. I also
> verified that the official #python site links to "Python Commandment Don't
> use Python 3... yet".

Well, it *should* say: "If you need to ask if you should use Python 2
or Python 3, you probably are better off with Python 2 for the
moment". But that's a bit long. :-)

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64

From regebro at gmail.com  Mon Jun 21 23:03:08 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 21 Jun 2010 23:03:08 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com> 
	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com> 
	<hvlcl8$cjv$1@dough.gmane.org>
	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com> 
	<AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
Message-ID: <AANLkTilPYrRmkJUXEBCyg8IynQ91PqsiAQtmz_vMC9zb@mail.gmail.com>

On Sun, Jun 20, 2010 at 18:20, Laurens Van Houtven <lvh at laurensvh.be> wrote:
> 2.x or 3.x? http://tinyurl.com/py2or3

Wow. That's almost not an improvement... That link doesn't really help
anyone choose at all.

-- 
Lennart Regebro: Python, Zope, Plone, Grok
http://regebro.wordpress.com/
+33 661 58 14 64

From martin at v.loewis.de  Mon Jun 21 23:12:54 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 21 Jun 2010 23:12:54 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FC7E6.5070707@voidspace.org.uk>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk>
Message-ID: <4C1FD5D6.7070007@v.loewis.de>

> If OS X is a supported and important platform for Python then fixing all
> problems that it reveals (or being willing to) should definitely not be
> a pre-requisite of providing a buildbot (which is already a service to
> the Python developer community). Fixing bugs / failures revealed by
> Bill's buildbot is not fixing them "for Bill" it is fixing them for Python.

I wish people would stop using the word "supported" when they talk about 
free software. *No* system is "supported" by Python - not even in the 
sense "we strive to pass the test suite". "We" don't.

Now, one may argue whether failing buildbots should be an unconditional 
reason to defer the release. I personally would say "no", despite what 
some PEP may say. People proposing that a release is postponed typically 
hope that somebody gets frustrated enough to step up and fix the bug, 
just so that the software gets released.

Instead, I would propose that the only way to delay a release is by 
proposing to take some specific action to remedy the situation that 
should cause the delay. Otherwise, releasing is at the discretion of the 
release manager, who has the ultimate say to whether the problem is 
important or not.

As for OSX, it seems that the only test that is failing is the ctypes 
test suite, and there only a single test. I don't think this is 
sufficient reason to block the release.

Regards,
Martin

From janssen at parc.com  Mon Jun 21 23:13:47 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 14:13:47 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <1277151926.3369.6.camel@localhost.localdomain>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk>
	<1277151926.3369.6.camel@localhost.localdomain>
Message-ID: <95708.1277154827@parc.com>

Antoine Pitrou <solipsis at pitrou.net> wrote:

> OS X is only "a supported and important platform" if we have dedicated
> core developers diagnosing or even fixing issues for it (like we
> obviously have for Windows and Linux). Otherwise, I don't think we have
> any moral obligation to support it.

Fair enough.

That being said, there are two classes of OS X issues.  The first is the
kind of thing that Ronald Oussoren and Ned Deily keep fixing for us,
which require a knowledge of OS X frameworks and SDKs and various other
deeply-Apple oddnesses.  But the second class is a set of UNIX issues,
where OS X is just a variant of UNIX with minor differences from other
UNIX platforms.

It looks to me as if we don't really need Apple geeks for the second
class of issues, we just need developers who have a Mac to test on.

It looks to me, for instance, as if the failures in test_py3kwarn and
test_uuid on Leopard are bugs in the Python testing framework that
happen to be exercised on OS X, rather than bugs caused in some way by
the platform.  There, the requisite knowledge is, how does regrtest.py
really work?

Bill

From martin at v.loewis.de  Mon Jun 21 23:16:31 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 21 Jun 2010 23:16:31 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FC144.70600@voidspace.org.uk>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<AANLkTinKcYtbrt9V5i18cT5mBD5A-rC6QWyY10TQ7cbi@mail.gmail.com>
	<4C1FC144.70600@voidspace.org.uk>
Message-ID: <4C1FD6AF.6050804@v.loewis.de>

On 21.06.2010 21:45, Michael Foord wrote:
> On 21/06/2010 20:30, Benjamin Peterson wrote:
>> 2010/6/21 Bill Janssen<janssen at parc.com>:
>>> They are at the end of the buildbot list, so off-screen if you are using
>>> a normal browser. You have to scroll to see them.
>> But not on the "stable" view and that's the only one I look at.
>>
>
> What are the requirements for moving the OS X buildbots into the stable
> view? Are the builders themselves stable enough? (If the requirement is
> that the buildbots be green then it is something of a catch-22.)

It is indeed the latter (at least, how I understand it). The builder 
should "usually" give green, which means it should have done so over 
some extended period of time. If it then gets broken it means that 
somebody actually broke the code, rather than the system showing one of 
its glitches.

So asking for addition to the stable list *while* the slave is red is a 
bad idea.

FWIW, nobody has requested changing any of the build slaves to "stable" 
for the last two years or so.

Regards,
Martin


From fuzzyman at voidspace.org.uk  Mon Jun 21 23:23:23 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 21 Jun 2010 22:23:23 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FD5D6.7070007@v.loewis.de>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
Message-ID: <4C1FD84B.3030202@voidspace.org.uk>

On 21/06/2010 22:12, "Martin v. Löwis" wrote:
>> If OS X is a supported and important platform for Python then fixing all
>> problems that it reveals (or being willing to) should definitely not be
>> a pre-requisite of providing a buildbot (which is already a service to
>> the Python developer community). Fixing bugs / failures revealed by
>> Bill's buildbot is not fixing them "for Bill" it is fixing them for 
>> Python.
>
> I wish people would stop using the word "supported" when they talk 
> about free software. *No* system is "supported" by Python - not even 
> in the sense "we strive to pass the test suite". "We" don't.
>

Well, for better or for worse I think "we" do. We certainly *strive* to 
support these platforms and having the buildbots is a big part of this.

> Now, one may argue whether failing buildbots should be an 
> unconditional reason to defer the release. I personally would say 
> "no", despite what some PEP may say. People proposing that a release 
> is postponed typically hope that somebody gets frustrated enough to 
> step up and fix the bug, just so that the software gets released.
>
> Instead, I would propose that the only way to delay a release is by 
> proposing to take some specific action to remedy the situation that 
> should cause the delay. Otherwise, releasing is at the discretion of 
> the release manager, who has the ultimate say to whether the problem 
> is important or not.
>

I would agree with leaving it to the discretion of the release manager, 
and we should aim for, rather than strictly require, all stable buildbots 
being green. I would still *expect* that a release manager would look at 
the stable buildbots before cutting a release.


> As for OSX, it seems that the only test that is failing is the ctypes 
> test suite, and there only a single test. I don't think this is 
> sufficient reason to block the release.
>
Bill listed several other failures he saw on the buildbots and I see the 
same set, plus test_posix.

All the best,

Michael

> Regards,
> Martin


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From simon at ikanobori.jp  Mon Jun 21 23:26:13 2010
From: simon at ikanobori.jp (Simon de Vlieger)
Date: Mon, 21 Jun 2010 23:26:13 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilPYrRmkJUXEBCyg8IynQ91PqsiAQtmz_vMC9zb@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>
	<hvlcl8$cjv$1@dough.gmane.org>
	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>
	<AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
	<AANLkTilPYrRmkJUXEBCyg8IynQ91PqsiAQtmz_vMC9zb@mail.gmail.com>
Message-ID: <4557027D-4EF9-4A4B-B816-3004DFB78F2A@ikanobori.jp>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 21 jun 2010, at 23:03, Lennart Regebro wrote:

> On Sun, Jun 20, 2010 at 18:20, Laurens Van Houtven  
> <lvh at laurensvh.be> wrote:
>> 2.x or 3.x? http://tinyurl.com/py2or3
>
> Wow. That's almost not an improvement... That link doesn't really help
> anyone choose at all.

Lennart,

That part of the topic will be replaced after all feedback is gathered
on the new article Laurens provided at
http://python-commandments.org/python3.html, as stated earlier in this thread.

Regards,

Simon de Vlieger
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (Darwin)

iQIcBAEBAgAGBQJMH9j1AAoJEBBSHP7i+JXf8qQP/1w6Esl/x6S5+4lDqykx0R7w
M9v6x8G2JvnthTkzh2hF76vruLc4e3SNs1QVCmirh5vjdkRHneJQ/2w/dRVKLi2b
/tayYg5QyzjPL37wiAarRnsr7SSiwFgEUCHWZVAAw0dRvszYF/CoLmxTs8TQWs8o
KnRuwO4UHuXvtarqO8JeY6gMR4bwcdEXHVNqdRK+PSoRXH9IVJky6IcqwtTC0bzf
vyLlQZmVdiXIXvjYOxNQgoufmsC74daqqodzhxtCn2WTHSN2s1ws/gkxBqe+NZPz
zYlAukVSiLz/YMcK3NGZYukseT8ZBGiNMuhPVt3lb4SY2LnKVRUiYqNCp9wpWCr/
ASmjaZDU0Dz5I+PHSNCWC4NHyTNClPy3b4b9y3LJ/6hpNZaC3wGHTX5IDxQKjt5u
ajEgzstM2wuZDtVNQhcADHk2KWBsCoaE9c0tXKz40T7nIq15zbbGqhyTXjmyouLB
JoonSPbS5Ap1UY6RGWEt6t3ZdVDDnMwJzL/DBMOiMgWZIVf7B6/VPy0j9jV9U0WV
Sx+U5WnaYqKYo+ZkRTg1iI6dPuK5GTGph+2gzjdTHRVMFFPETxkFz/pBZJG4DOHq
bkaKG2IFMWB+Ua9GrTJTbfmTP3YzgJwBG34ZWRLFSQu7zJaY1JdQqQK7z+SCJ5Lg
toMEpj7z8KxfUAF84xBG
=hTod
-----END PGP SIGNATURE-----

From foom at fuhm.net  Mon Jun 21 22:54:06 2010
From: foom at fuhm.net (James Y Knight)
Date: Mon, 21 Jun 2010 16:54:06 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C1FCB99.1090102@egenix.com>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>	<20100621114307.48735698@heresy>
	<20100621163404.GV5787@unaka.lan>
	<20100621155550.643d27b8@heresy> <4C1FCB99.1090102@egenix.com>
Message-ID: <1E87B24F-FE0A-4C97-B895-FB15022DA2A0@fuhm.net>

On Jun 21, 2010, at 4:29 PM, M.-A. Lemburg wrote:
> Here's a little known fact: by changing the Python2 default
> encoding to 'undefined' (yes, that's a real codec !), you can disable
> all automatic string coercion in Python2.

I tried that once: half the stdlib stops working if you do (for  
example, the re module), so it's not particularly useful for checking  
if your own code is unicode-safe.

James

From regebro at gmail.com  Mon Jun 21 23:29:31 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Mon, 21 Jun 2010 23:29:31 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <4557027D-4EF9-4A4B-B816-3004DFB78F2A@ikanobori.jp>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com> 
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com> 
	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com> 
	<hvlcl8$cjv$1@dough.gmane.org>
	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com> 
	<AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com> 
	<AANLkTilPYrRmkJUXEBCyg8IynQ91PqsiAQtmz_vMC9zb@mail.gmail.com> 
	<4557027D-4EF9-4A4B-B816-3004DFB78F2A@ikanobori.jp>
Message-ID: <AANLkTinW33AdlBGbrnyo-t5-yonjEOrwHUjeJfEOM7mx@mail.gmail.com>

On Mon, Jun 21, 2010 at 23:26, Simon de Vlieger <simon at ikanobori.jp> wrote:
> That part of the topic will be replaced after all feedback is gathered on
> the new article Laurens provided at:
> http://python-commandments.org/python3.html as stated earlier in this
> thread.

OK, great, I missed that!

-- 
Lennart Regebro: Python, Zope, Plone, Grok
http://regebro.wordpress.com/
+33 661 58 14 64

From martin at v.loewis.de  Mon Jun 21 23:36:37 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Mon, 21 Jun 2010 23:36:37 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FD84B.3030202@voidspace.org.uk>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk>
Message-ID: <4C1FDB65.4020503@v.loewis.de>

> Bill listed several other failures he saw on the buildbots and I see the
> same set, plus test_posix.

Still, the question would be whether any of these failures can manage to 
block a release. Are they regressions from 2.6? That would make them 
good candidates for release blockers. Except that I still would like to 
see commitment from somebody to fix them or else they can't block the 
release: if "we" don't mean that supporting a platform also means 
volunteering to fix bugs, then I guess "we" should stop declaring the
platform supported. Just wishing that it was supported actually doesn't 
make it so.

If the test failure *isn't* a regression, I think it shouldn't block the 
release.

Regards,
Martin


From a.badger at gmail.com  Mon Jun 21 23:41:19 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 17:41:19 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621201006.5A3223A404D@sparrow.telecommunity.com>
References: <20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621192952.GZ5787@unaka.lan>
	<20100621201006.5A3223A404D@sparrow.telecommunity.com>
Message-ID: <20100621214119.GB5787@unaka.lan>

On Mon, Jun 21, 2010 at 04:09:52PM -0400, P.J. Eby wrote:
> At 03:29 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> >On Mon, Jun 21, 2010 at 01:24:10PM -0400, P.J. Eby wrote:
> >> At 12:34 PM 6/21/2010 -0400, Toshio Kuratomi wrote:
> >> >What do you think of making the encoding attribute a mandatory part of
> >> >creating an ebyte object?  (ex: ``eb = ebytes(b, 'euc-jp')``).
> >>
> >> As long as the coercion rules force str+ebytes (or str % ebytes,
> >> ebytes % str, etc.) to result in another ebytes (and fail if the str
> >> can't be encoded in the ebytes' encoding), I'm personally fine with
> >> it, although I really like the idea of tacking the encoding to bytes
> >> objects in the first place.
> >>
> >I wouldn't like this.  It brings us back to the python2 problem where
> >sometimes you pass an ebyte into a function and it works and other times you
> >pass an ebyte into the function and it issues a traceback.
> 
> For stdlib functions, this isn't going to happen unless your ebytes'
> encoding is not compatible with the ascii subset of unicode, or the
> stdlib function is working with dynamic data...  in which case you
> really *do* want to fail early!
> 
The ebytes encoding will often be incompatible with the ascii subset.
It's the reason that people were so often tempted to change the
defaultencoding on python2 to utf8.

> I don't see this as a repeat of the 2.x situation; rather, it allows
> you to cause errors to happen much *earlier* than they would
> otherwise show up if you were using unicode for your encoded-bytes
> data.
> 
> For example, if your program's intent is to end up with latin-1
> output, then it would be better for an error to show up at the very
> *first* point where non-latin1 characters are mixed with your data,
> rather than only showing up at the output boundary!
> 
That highly depends on your usage.  If you're formatting a comment on a web
page, checking at output and replacing with '?' is better than a traceback.
If you're entering key values into a database, then you likely want to know
where the non-latin1 data is entering your program, not where it's mixed
with your data or the output boundary.

> However, if you promoted mixed-type operation results to unicode
> instead of ebytes, then you:
> 
> 1) can't preserve data that doesn't have a 1:1 mapping to unicode, and
> 
ebytes should be immutable like bytes and str.  So you shouldn't lose the
data if you keep a reference to it.

> 2) can't detect an error until your data reaches the output point in
> your application -- forcing you to defensively insert ebytes calls
> everywhere (vs. simply wrapping them around a handful of designated
> inputs), or else have to go right back to tracing down where the
> unusable data showed up in the first place.
> 
Usually, you don't want to know where you are combining two incompatible
strings.  Instead, you want to know where the incompatible strings are being
set in the first place.  If function(a, b) tracebacks with certain
combinations of a and b I need to know where a and b are being set, not
where function(a, b) is in the source code.  So you need to be making input
values ebytes() (or str in current python3) no matter what.

> One thing that seems like a bit of a blind spot for some folks is
> that having unicode is *not* everybody's goal.  Not because we don't
> believe unicode is generally a good thing or anything like that, but
> because we have to work with systems that flat out don't *do*
> unicode, thereby making the presence of (fully-general) unicode an
> error condition that has to be stamped out!
> 
I think that sometimes as well.  However, here I think you're in a bit of
a blind spot yourself.  I'm saying that making ebytes + str coerce to ebytes
will only yield a traceback some of the time; which is the python2
behaviour.  Having ebytes + str coerce to str will never throw a traceback
as long as our implementation checks that the bytes and encoding work
together from the start.

Throwing an error in code, only on some input is one of the main reasons
that debugging unicode vs byte issues sucks on python2.  On my box, with my
dataset, everything works.  Toss it up on pypi and suddenly I have a user in
Japan who reports that he gets a traceback with his dataset that he can't
give to me because it's proprietary, overly large, or transient.
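
A minimal sketch of the difference being argued here, in Python 3. There is no ebytes type, so plain bytes plus an explicit encoding name stand in for it, and both function names are assumptions made for illustration.

```python
def join_as_bytes(encoded, encoding, text):
    """Coerce the str operand into the bytes' encoding.

    Whether this works depends on the characters in text.
    """
    return encoded + text.encode(encoding)


def join_as_text(encoded, encoding, text):
    """Coerce the bytes operand to str instead.

    This only fails if encoded is not valid in encoding, which can be
    checked once, when the encoded value is first created.
    """
    return encoded.decode(encoding) + text


header = "Subject: caf\u00e9 ".encode("latin-1")

# Works with the data on the developer's box ...
print(join_as_bytes(header, "latin-1", "menu"))

# ... and fails only once a user supplies text outside latin-1.
try:
    join_as_bytes(header, "latin-1", "\u65e5\u672c")
except UnicodeEncodeError as exc:
    print("failed on this input:", exc.reason)

# Coercing towards str succeeds for both inputs.
print(join_as_text(header, "latin-1", "\u65e5\u672c"))
```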


> IOW, if you're producing output that has to go into another system
> that doesn't take unicode, it doesn't matter how
> theoretically-correct it would be for your app to process the data in
> unicode form.  In that case, unicode is not a feature: it's a bug.
> 
This is not always true.  Say you read a webpage, chop it up so you get
a list of words, create a histogram of word length, and then write the
output as utf8 to a database.  Should you do all your intermediate string
operations on utf8 encoded byte strings?  No, you should do them on unicode
strings, as otherwise you need to know the details of how utf8 encodes
characters.
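
To make the histogram case concrete, here is a small Python 3 sketch. The sample text and the function name are made up for illustration.

```python
from collections import Counter


def word_length_histogram(words):
    """Map each word length to the number of words with that length."""
    return Counter(len(word) for word in words)


text = "na\u00efve caf\u00e9 tea"
words = text.split()

# On str objects, len() counts characters.
by_characters = word_length_histogram(words)

# On utf8 bytes, len() counts octets, so every non-ASCII word lands in
# the wrong bucket unless the code knows how utf8 lays out characters.
by_octets = word_length_histogram(word.encode("utf-8") for word in words)

print(sorted(by_characters.items()))
print(sorted(by_octets.items()))
```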

> And as it really *is* an error in that case, it should not pass
> silently, unless explicitly silenced.
> 
This is very true -- although the python3 stdlib does explicitly silence
errors related to unicode in some cases.

Anyhow -- IMHO, you should get a TypeError when you attempt to pass
a unicode value into a function that is meant to work with bytes.  (You can
accept an ebytes object as well since it has a known bytes representation).

-Toshio

From lvh at laurensvh.be  Mon Jun 21 23:41:59 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 21 Jun 2010 23:41:59 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <AANLkTilPYrRmkJUXEBCyg8IynQ91PqsiAQtmz_vMC9zb@mail.gmail.com>
References: <20100618050712.GC20639@thorne.id.au>
	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>
	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>
	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>
	<hvjlpt$8pe$1@dough.gmane.org>
	<4A9DD603-3A5F-4A8A-A37C-21C88ABC2A43@twistedmatrix.com>
	<hvlcl8$cjv$1@dough.gmane.org>
	<AANLkTim_opwujuTwVpQLdPo7I1HT1D6-tzYimZtJyJco@mail.gmail.com>
	<AANLkTimjOd5cGsNKP7bRaBYZRboMdnuGXQek4cDfGu1x@mail.gmail.com>
	<AANLkTilPYrRmkJUXEBCyg8IynQ91PqsiAQtmz_vMC9zb@mail.gmail.com>
Message-ID: <AANLkTileuAeWnJ3y8RZsSZkqbwKLCiC-AKsLq2RmZzqM@mail.gmail.com>

On Mon, Jun 21, 2010 at 11:03 PM, Lennart Regebro <regebro at gmail.com> wrote:
> On Sun, Jun 20, 2010 at 18:20, Laurens Van Houtven <lvh at laurensvh.be> wrote:
>> 2.x or 3.x? http://tinyurl.com/py2or3
>
> Wow. That's almost not an improvement... That link doesn't really help
> anyone choose at all.
>
> --
> Lennart Regebro: Python, Zope, Plone, Grok
> http://regebro.wordpress.com/
> +33 661 58 14 64
>

Please read the rest of the thread: that's ancient information and no
longer the latest work. We just removed the thing that offended
people, so that the situation could be defused instantly and then we
could work towards something everyone liked in a calm and productive
environment.

Laurens

From john.arbash.meinel at gmail.com  Mon Jun 21 23:52:08 2010
From: john.arbash.meinel at gmail.com (John Arbash Meinel)
Date: Mon, 21 Jun 2010 16:52:08 -0500
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621214119.GB5787@unaka.lan>
References: <20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>	<20100621114307.48735698@heresy>
	<20100621163404.GV5787@unaka.lan>	<20100621172413.578853A404D@sparrow.telecommunity.com>	<20100621192952.GZ5787@unaka.lan>	<20100621201006.5A3223A404D@sparrow.telecommunity.com>
	<20100621214119.GB5787@unaka.lan>
Message-ID: <4C1FDF08.3010401@gmail.com>


...
>> IOW, if you're producing output that has to go into another system
>> that doesn't take unicode, it doesn't matter how
>> theoretically-correct it would be for your app to process the data in
>> unicode form.  In that case, unicode is not a feature: it's a bug.
>>
> This is not always true.  If you read a webpage, chop it up so you get
> a list of words, create a histogram of word length, and then write the output as
> utf8 to a database.  Should you do all your intermediate string operations
> on utf8 encoded byte strings?  No, you should do them on unicode strings as
> otherwise you need to know about the details of how utf8 encodes characters.
> 

You'd still have problems in Unicode given stuff like å =~ å even though
u'\xe5' vs u'a\u030a' (those will look the same depending on your
Unicode system. IDLE shows them pretty much the same, T-Bird on Windows
with my current font shows the second as 2 characters.)

I realize this was a toy example, but it does point out that Unicode
complicates the idea of 'equality' as well as the idea of 'what is a
character'. And just saying "decode it to Unicode" isn't really sufficient.

John
=:->


From fuzzyman at voidspace.org.uk  Mon Jun 21 23:52:28 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 21 Jun 2010 22:52:28 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FDB65.4020503@v.loewis.de>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
Message-ID: <4C1FDF1C.2060308@voidspace.org.uk>

On 21/06/2010 22:36, "Martin v. Löwis" wrote:
>> Bill listed several other failures he saw on the buildbots and I see the
>> same set, plus test_posix.
>
> Still, the question would be whether any of these failures can manage 
> to block a release. Are they regressions from 2.6?

The test_posix failure is a regression from 2.6 (but it only shows up on 
some machines - it is caused by a fairly braindead implementation of a 
couple of posix apis by Apple apparently).

     http://bugs.python.org/issue7900

There are various patches available and a lot of work that has gone into 
diagnosing it - but there was some disagreement on what is the *best* 
way to fix it.

Two of the other failures I'm pretty sure are problems in the test suite 
rather than bugs (as Bill said) and I'm not sure about the ctypes issue. 
Just starting a full build here.

Michael
> That would make them good candidates for release blockers. Except that 
> I still would like to see commitment from somebody to fix them or else 
> they can't block the release: if "we" don't mean that supporting a 
> platform also means volunteering to fix bugs, then I guess "we" should 
> stop declaring the
> platform supported. Just wishing that it was supported actually 
> doesn't make it so.
>
> If the test failure *isn't* a regression, I think it shouldn't block 
> the release.
>
> Regards,
> Martin
>
>
>


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From fuzzyman at voidspace.org.uk  Mon Jun 21 23:57:04 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 21 Jun 2010 22:57:04 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FDF1C.2060308@voidspace.org.uk>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>	<4C1FC7E6.5070707@voidspace.org.uk>
	<4C1FD5D6.7070007@v.loewis.de>	<4C1FD84B.3030202@voidspace.org.uk>
	<4C1FDB65.4020503@v.loewis.de> <4C1FDF1C.2060308@voidspace.org.uk>
Message-ID: <4C1FE030.7020700@voidspace.org.uk>

On 21/06/2010 22:52, Michael Foord wrote:
> On 21/06/2010 22:36, "Martin v. Löwis" wrote:
>>> Bill listed several other failures he saw on the buildbots and I see 
>>> the
>>> same set, plus test_posix.
>>
>> Still, the question would be whether any of these failures can manage 
>> to block a release. Are they regressions from 2.6?
>
> The test_posix failure is a regression from 2.6 (but it only shows up 
> on some machines - it is caused by a fairly braindead implementation 
> of a couple of posix apis by Apple apparently).
>
>     http://bugs.python.org/issue7900
>
> There are various patches available and a lot of work that has gone 
> into diagnosing it - but there was some disagreement on what is the 
> *best* way to fix it.
>
> Two of the other failures I'm pretty sure are problems in the test 
> suite rather than bugs (as Bill said) and I'm not sure about the 
> ctypes issue. Just starting a full build here.

Right now I'm *only* seeing these two failures on Mac OS X (10.6.4):

     test_posix test_urllib2_localnet

All the best,

Michael

>
> Michael
>> That would make them good candidates for release blockers. Except 
>> that I still would like to see commitment from somebody to fix them 
>> or else they can't block the release: if "we" don't mean that 
>> supporting a platform also means volunteering to fix bugs, then I 
>> guess "we" should stop declaring the
>> platform supported. Just wishing that it was supported actually 
>> doesn't make it so.
>>
>> If the test failure *isn't* a regression, I think it shouldn't block 
>> the release.
>>
>> Regards,
>> Martin
>>
>>
>>
>
>


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog



From ncoghlan at gmail.com  Tue Jun 22 00:03:58 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Jun 2010 08:03:58 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621201616.EADEF3A404D@sparrow.telecommunity.com>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621160420.63037f1c@heresy>
	<20100621201616.EADEF3A404D@sparrow.telecommunity.com>
Message-ID: <AANLkTimWNuT_sCJNoWN_vcN3QtqDmn4u-ELViBq3ONHv@mail.gmail.com>

On Tue, Jun 22, 2010 at 6:16 AM, P.J. Eby <pje at telecommunity.com> wrote:
> True, but making it a separate type with a required encoding gets rid of the
> magical "I don't know" - the "I don't know" encoding is just a plain old
> bytes object.

So, to boil down the ebytes idea, it is basically a request for a
second string type that holds an octet stream plus an encoding name,
rather than a Unicode character stream. Calling it "ebytes" seems to
emphasise the wrong parallel in that case (you have a 'str' object
with a different internal structure, not any kind of bytes object).
For now I'll call it an "altstr". Then the idea can be described as

- altstr would expose the same API as str, NOT the same API as bytes
- explicit conversion via "str" would use the altstr's __str__ method
- explicit conversion via "bytes" would use the altstr's __bytes__ method
- implicit interaction with str would convert the str to an altstr
object according to the altstr's rules. This may be best handled via a
coercion method on altstr, rather than str actually needing to know
the details (i.e. an altstr.__coerce_str__() method). For the
'ebytes' model, this would do something like
"type(self)(other.encode(self.encoding), self.encoding)". The
operation would then be handled by the corresponding method on the
coerced object. A new type could then override operations such as
__contains__, __mod__, format() and join().

This is still smelling an awful lot like the 2.x str type to me, but
supporting a __coerce_str__ method may allow some useful
experimentation in this space (as PJE suggested). There's a chance it
would be abused, but it offers a greater chance of success than trying
to come up with a concrete altstr type without providing a means for
experimentation first.
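
A rough Python 3 sketch of the 'ebytes' flavour of this idea. Everything here is hypothetical: the AltStr class, its _coerce helper and the __coerce_str__ hook are illustration only, and the interpreter never calls __coerce_str__ itself, so the class has to invoke it by hand from __add__ and __radd__. Only concatenation is shown.

```python
class AltStr:
    """Hypothetical string-like type: an octet stream plus an encoding name."""

    def __init__(self, data, encoding):
        self._data = bytes(data)
        self.encoding = encoding

    def __str__(self):
        # Explicit conversion via str().
        return self._data.decode(self.encoding)

    def __bytes__(self):
        # Explicit conversion via bytes().
        return self._data

    def __coerce_str__(self, other):
        # The coercion rule described above: encode the str into this
        # object's encoding, raising if it does not fit.
        return type(self)(other.encode(self.encoding), self.encoding)

    def _coerce(self, other):
        if isinstance(other, str):
            return self.__coerce_str__(other)
        if isinstance(other, AltStr) and other.encoding == self.encoding:
            return other
        return None

    def __add__(self, other):
        coerced = self._coerce(other)
        if coerced is None:
            return NotImplemented
        return type(self)(self._data + coerced._data, self.encoding)

    def __radd__(self, other):
        coerced = self._coerce(other)
        if coerced is None:
            return NotImplemented
        return type(self)(coerced._data + self._data, self.encoding)


word = AltStr("caf\u00e9".encode("latin-1"), "latin-1")
left = "un " + word    # str on the left is routed through __radd__
right = word + "!"     # str on the right is routed through __add__
print(str(left), bytes(left), str(right))
```

Mixing in a str that the encoding cannot represent (for example word + "\u65e5\u672c") raises UnicodeEncodeError at the point of mixing, which is the early-failure behaviour PJE asked for and the data-dependent failure Toshio objected to.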

> (In principle, you could then drop *all* the stringlike methods from
> plain-old-bytes objects.  If it's really text-in-bytes you want, you should
> use an ebytes with the encoding specified.)

Except that a lot of those string-like methods are just plain useful,
even when you *know* you're dealing with an octet stream rather than
latin-1 encoded text.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From a.badger at gmail.com  Tue Jun 22 00:06:57 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Mon, 21 Jun 2010 18:06:57 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C1FDF08.3010401@gmail.com>
References: <AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621192952.GZ5787@unaka.lan>
	<20100621201006.5A3223A404D@sparrow.telecommunity.com>
	<20100621214119.GB5787@unaka.lan> <4C1FDF08.3010401@gmail.com>
Message-ID: <20100621220657.GC5787@unaka.lan>

On Mon, Jun 21, 2010 at 04:52:08PM -0500, John Arbash Meinel wrote:
> 
> ...
> >> IOW, if you're producing output that has to go into another system
> >> that doesn't take unicode, it doesn't matter how
> >> theoretically-correct it would be for your app to process the data in
> >> unicode form.  In that case, unicode is not a feature: it's a bug.
> >>
> > This is not always true.  If you read a webpage, chop it up so you get
> > a list of words, create a histogram of word length, and then write the output as
> > utf8 to a database.  Should you do all your intermediate string operations
> > on utf8 encoded byte strings?  No, you should do them on unicode strings as
> > otherwise you need to know about the details of how utf8 encodes characters.
> > 
> 
> You'd still have problems in Unicode given stuff like å =~ å even though
> u'\xe5' vs u'a\u030a' (those will look the same depending on your
> Unicode system. IDLE shows them pretty much the same, T-Bird on Windows
> with my current font shows the second as 2 characters.)
> 
> I realize this was a toy example, but it does point out that Unicode
> complicates the idea of 'equality' as well as the idea of 'what is a
> character'. And just saying "decode it to Unicode" isn't really sufficient.
> 
Ah -- but if you're dealing with unicode objects you can use the
unicodedata.normalize() function on them to come out with the right values.
If you're using bytes, it's yet another case where you, the programmer, have
to know what byte sequences represent combining characters in the particular
encoding that you're dealing with.
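
A short sketch of that normalization step, in Python 3 syntax (the u'' literals quoted above would be the Python 2 spelling). The variable names are made up for illustration.

```python
import unicodedata

precomposed = "\xe5"    # LATIN SMALL LETTER A WITH RING ABOVE
decomposed = "a\u030a"  # 'a' followed by COMBINING RING ABOVE

# The two spellings render alike but are different code point sequences.
print(precomposed == decomposed, len(precomposed), len(decomposed))

# Normalizing both sides to the same form makes them compare equal.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed
print(nfc_equal, nfd_equal)

# On encoded bytes the same comparison needs knowledge of the encoding:
# these are the two utf8 byte sequences for the "same" character.
print(precomposed.encode("utf-8"), decomposed.encode("utf-8"))
```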

-Toshio

From martin at v.loewis.de  Tue Jun 22 00:16:15 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Tue, 22 Jun 2010 00:16:15 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FDF1C.2060308@voidspace.org.uk>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
Message-ID: <4C1FE4AF.80009@v.loewis.de>

> The test_posix failure is a regression from 2.6 (but it only shows up on
> some machines - it is caused by a fairly braindead implementation of a
> couple of posix apis by Apple apparently).
>
> http://bugs.python.org/issue7900

Ah, that one. I definitely think this should *not* block the release:
a) there is no clear solution in sight. So if we wait for it to be
    resolved, it could take months until we get a 2.7 release.
b) it's only about getgroups - a fairly minor API.
c) IIUC, it only occurs for users who are members of more than 16
    groups - a fairly uncommon setup.

Regards,
Martin

From ncoghlan at gmail.com  Tue Jun 22 00:18:20 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Jun 2010 08:18:20 +1000
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <20100621203756.2f99757f@pitrou.net>
References: <73196.1277143019@parc.com>
	<20100621203756.2f99757f@pitrou.net>
Message-ID: <AANLkTiknXYQ2kxu-V3Z-EpdU5UR8uaYhXTdtr_D9PeDA@mail.gmail.com>

> There also seem to be a couple of failures left with test_gdb...

Do you mean the compiler and debugger specific issues reported in
http://bugs.python.org/issue8482?

Fixing that properly is messy, and according to Victor's last message,
even the correct conditions for skipping the test aren't completely
clear.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From p.f.moore at gmail.com  Tue Jun 22 00:19:33 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Jun 2010 23:19:33 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FE030.7020700@voidspace.org.uk>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
Message-ID: <AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>

On 21 June 2010 22:57, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>> Two of the other failures I'm pretty sure are problems in the test suite
>> rather than bugs (as Bill said) and I'm not sure about the ctypes issue.
>> Just starting a full build here.
>
> Right now I'm *only* seeing these two failures on Mac OS X (10.6.4):
>
>    test_posix test_urllib2_localnet

I'm still seeing a test_ctypes failure (on Windows XP). Not sure if
it's the same one Bill was seeing:

FAIL: test_issue_8959_b (ctypes.test.test_callbacks.SampleCallbacksTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "C:\buildslave\trunk.moore-windows\build\lib\ctypes\test\test_callbacks.py",
line 208, in test_issue_8959_b
    self.assertFalse(windowCount == 0)
AssertionError: True is not False

Looks like this test was added today, and counts the windows. As my
buildbot is running as a service, and I generally leave it running
when logged off, a window count of 0 may well be correct - I can't be
sure. So my view is that it's possibly a bug in the test - but it
could do with someone more expert to confirm this.

I've got a build running at the moment, when it's finished I'll rerun
the trunk build (I currently have a disconnected session with a window
open, I'll see if that makes it pass).

Paul.

From p.f.moore at gmail.com  Tue Jun 22 00:39:56 2010
From: p.f.moore at gmail.com (Paul Moore)
Date: Mon, 21 Jun 2010 23:39:56 +0100
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
	<AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
Message-ID: <AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>

On 21 June 2010 23:19, Paul Moore <p.f.moore at gmail.com> wrote:
> On 21 June 2010 22:57, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
>>> Two of the other failures I'm pretty sure are problems in the test suite
>>> rather than bugs (as Bill said) and I'm not sure about the ctypes issue.
>>> Just starting a full build here.
>>
>> Right now I'm *only* seeing these two failures on Mac OS X (10.6.4):
>>
>>    test_posix test_urllib2_localnet
>
> I'm still seeing a test_ctypes failure (on Windows XP). Not sure if
> it's the same one Bill was seeing:
>
> FAIL: test_issue_8959_b (ctypes.test.test_callbacks.SampleCallbacksTestCase)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "C:\buildslave\trunk.moore-windows\build\lib\ctypes\test\test_callbacks.py",
> line 208, in test_issue_8959_b
>     self.assertFalse(windowCount == 0)
> AssertionError: True is not False
>
> Looks like this test was added today, and counts the windows. As my
> buildbot is running as a service, and I generally leave it running
> when logged off, a window count of 0 may well be correct - I can't be
> sure. So my view is that it's possibly a bug in the test - but it
> could do with someone more expert to confirm this.
>
> I've got a build running at the moment, when it's finished I'll rerun
> the trunk build (I currently have a disconnected session with a window
> open, I'll see if that makes it pass).

Yes, looks like it's a bug in the test. http://bugs.python.org/issue9055 raised.

Paul.

From stefan at bytereef.org  Tue Jun 22 00:37:11 2010
From: stefan at bytereef.org (Stefan Krah)
Date: Tue, 22 Jun 2010 00:37:11 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <94209.1277152580@parc.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<94209.1277152580@parc.com>
Message-ID: <20100621223711.GA25865@yoda.bytereef.org>

Bill Janssen <janssen at parc.com> wrote:
> % make test
> [...]
> test_uuid
> test test_uuid failed -- Traceback (most recent call last):
>   File "/private/tmp/Python-2.7rc2/Lib/test/test_uuid.py", line 472, in testIssue8621
>     self.assertNotEqual(parent_value, child_value)
> AssertionError: '8395a08e40454895be537a180539b7fb' == '8395a08e40454895be537a180539b7fb'
> 
> [...]

I reopened http://bugs.python.org/issue8621 . Could you comment there
and help resolve the test failure?


Stefan Krah


From tjreedy at udel.edu  Tue Jun 22 00:48:12 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 18:48:12 -0400
Subject: [Python-Dev] Adding additional level of bookmarks and section
 numbers in python pdf documents.
In-Reply-To: <AANLkTikMD8ZXuA50e-QA7QRmMQg0qv6YNc7M-AKmfDp6@mail.gmail.com>
References: <AANLkTikMD8ZXuA50e-QA7QRmMQg0qv6YNc7M-AKmfDp6@mail.gmail.com>
Message-ID: <hvoq7c$66o$1@dough.gmane.org>

On 6/21/2010 4:07 PM, Peng Yu wrote:
> Hi,
>
> Current pdf version of python documents don't have bookmarks for
> subsubsection. For example, there is no bookmark for the following
> section in python_2.6.5_reference.pdf. Also the bookmarks don't have
> section numbers in them. I suggest to include the section numbers.
> Could these features be added in future release of python document.
>
> 3.4.1 Basic customization

Search doc issues on the tracker for this topic and file a feature 
request doc issue if there is not one.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue Jun 22 01:01:09 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 19:01:09 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvogc0$tpt$2@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>	<hvm6cu$gaq$1@dough.gmane.org>	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>	<hvo3ln$55n$1@dough.gmane.org>
	<hvogc0$tpt$2@dough.gmane.org>
Message-ID: <hvoqvm$a8o$1@dough.gmane.org>

On 6/21/2010 3:59 PM, Steve Holden wrote:
> Terry Reedy wrote:
>> On 6/21/2010 8:33 AM, Nick Coghlan wrote:
>>
>>> P.S. (We're going to have a tough decision to make somewhere along the
>>> line where docs.python.org is concerned, too - when do we flick the
>>> switch and make a 3.x version of the docs the default?
>>
>> Easy. When 3.2 is released. When 2.7 is released, 3.2 becomes 'trunk'.
>> Trunk released always take over docs.python.org. To do otherwise would
>> be to say that 3.2 is not a real trunk release and not yet ready for
>> real use -- a major slam.
>>
>> Actually, I thought this was already discussed and decided ;-).
>>
> This also gives the 2.7 release its day in the sun before relegation to
> maintenance status.

Every new version (except 3.0 and 3.1) has gone to maintenance status 
*and* become the featured release on docs.python.org the day it was 
released.  2.7 would just spend less time as the featured release on 
that page.

> The Python 3 documents, when they become the default, should contain an
> every-page link to the Python 2 documentation (though linkages may be a
> problem - they could probably be done at a gross level).

docs.python.org contains links to docs to other releases, both past and 
future. There is no reason to treat 3.2 specially, or to junk up its 
pages. The 3.x docs have intentionally been cleaned of nearly all 
references to 2.x. The current 2.6 and 2.7 pages have no references to 
corresponding 3.1 pages.

Terry Jan Reedy


From barry at python.org  Tue Jun 22 01:12:57 2010
From: barry at python.org (Barry Warsaw)
Date: Mon, 21 Jun 2010 19:12:57 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTimWNuT_sCJNoWN_vcN3QtqDmn4u-ELViBq3ONHv@mail.gmail.com>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy> <20100621163404.GV5787@unaka.lan>
	<20100621172413.578853A404D@sparrow.telecommunity.com>
	<20100621160420.63037f1c@heresy>
	<20100621201616.EADEF3A404D@sparrow.telecommunity.com>
	<AANLkTimWNuT_sCJNoWN_vcN3QtqDmn4u-ELViBq3ONHv@mail.gmail.com>
Message-ID: <20100621191257.698ae6cc@heresy>

On Jun 22, 2010, at 08:03 AM, Nick Coghlan wrote:

>On Tue, Jun 22, 2010 at 6:16 AM, P.J. Eby <pje at telecommunity.com> wrote:
>> True, but making it a separate type with a required encoding gets rid of the
>> magical "I don't know" - the "I don't know" encoding is just a plain old
>> bytes object.
>
>So, to boil down the ebytes idea, it is basically a request for a
>second string type that holds an octet stream plus an encoding name,
>rather than a Unicode character stream. Calling it "ebytes" seems to
>emphasise the wrong parallel in that case (you have a 'str' object
>with a different internal structure, not any kind of bytes object).
>For now I'll call it an "altstr". Then the idea can be described as

Actually no.  We're introducing a second bytes type that holds an octet stream
plus an encoding name.  See the toy implementation I included in a previous
message.

As opposed to say a bytes object that represented an image, which would make
almost no sense to decode to a unicode, this ebytes type would help bridge the
gap between a pure bytes object and a pure unicode object.  It would know how
to accurately convert to a unicode (i.e. __str__()) because it would know the
encoding of the bytes.  Obviously, it could convert to a pure bytes object.
Because it can be accurately stringified, it can have most if not all of
the str API.
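
A minimal sketch of the idea described above, assuming a hypothetical
`ebytes` class written purely for illustration (it is not the toy
implementation mentioned earlier in the thread, and the method names
`to_bytes` and `upper` are choices made here). It shows a bytes subclass
that carries an encoding name, converts accurately to text, and routes one
str-style operation through that text form:

```python
class ebytes(bytes):
    """Hypothetical bytes subclass that remembers its encoding."""

    def __new__(cls, data, encoding):
        self = super().__new__(cls, data)
        self.encoding = encoding
        return self

    def __str__(self):
        # Accurate conversion to text is possible because the encoding is known.
        return self.decode(self.encoding)

    def to_bytes(self):
        # Drop the encoding and fall back to a plain bytes object.
        return bytes(self)

    def upper(self):
        # One str-like method: decode, operate on text, re-encode.
        text = str(self).upper()
        return ebytes(text.encode(self.encoding), self.encoding)
```

A full version would need to decide what happens when two ebytes objects
with different encodings are combined, which is the part of the design
under debate in this thread.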

-Barry

From db3l.net at gmail.com  Tue Jun 22 01:17:48 2010
From: db3l.net at gmail.com (David Bolen)
Date: Mon, 21 Jun 2010 19:17:48 -0400
Subject: [Python-Dev] red buildbots on 2.7
References: <73196.1277143019@parc.com>
	<AANLkTinHh0pBEX27beeI6fwtPdGoFxfK8MHzumSDP3A_@mail.gmail.com>
Message-ID: <m2wrts3rhf.fsf@valheru.db3l.homeip.net>

Paul Moore <p.f.moore at gmail.com> writes:

> Thanks for the alert. I've killed the stuck test and should see some
> runs going through now. Shame, really, I was getting used to seeing a
> nice page of all green results...

In my experience, my OSX and Windows buildbots need some manual TLC on
an ongoing basis.  I kill off stranded python processes several times
a week on both platforms.  OSX actually seems as bad as Windows in
this regard, which is strange given its *nix heritage, but perhaps it's
how some of the test processes are created.

Most of the time the stranded processes aren't hurting anything but
local resource, but sometimes they can lock directories, or hang a
build/test for a particular builder.

My windows buildbots also have a tendency to fill up temp, or even if
there's room, get sluggish due to all the cruft left in that
directory, so I periodically clean that out manually as well.

-- David


From steve at pearwood.info  Tue Jun 22 01:23:28 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Jun 2010 09:23:28 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621201006.5A3223A404D@sparrow.telecommunity.com>
References: <AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<20100621192952.GZ5787@unaka.lan>
	<20100621201006.5A3223A404D@sparrow.telecommunity.com>
Message-ID: <201006220923.28378.steve@pearwood.info>

On Tue, 22 Jun 2010 06:09:52 am P.J. Eby wrote:
> However, if you promoted mixed-type operation results to unicode
> instead of ebytes, then you:
>
> 1) can't preserve data that doesn't have a 1:1 mapping to unicode,

Sounds like exactly the sort of thing the Unicode private codepoints 
were invented for, as Toshio suggests.

In any case, if there are use-cases for text that aren't solved by 
Unicode, and I'm not convinced that there are, Python doesn't need to 
solve them. At the very least, such a solution should start off as a 
third-party package to prove itself before being made a part of the 
standard library, let alone a built-in.


-- 
Steven D'Aprano

From steve at pearwood.info  Tue Jun 22 01:27:31 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Jun 2010 09:27:31 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTimWNuT_sCJNoWN_vcN3QtqDmn4u-ELViBq3ONHv@mail.gmail.com>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621201616.EADEF3A404D@sparrow.telecommunity.com>
	<AANLkTimWNuT_sCJNoWN_vcN3QtqDmn4u-ELViBq3ONHv@mail.gmail.com>
Message-ID: <201006220927.31875.steve@pearwood.info>

On Tue, 22 Jun 2010 08:03:58 am Nick Coghlan wrote:
> On Tue, Jun 22, 2010 at 6:16 AM, P.J. Eby <pje at telecommunity.com> wrote:
> > True, but making it a separate type with a required encoding gets
> > rid of the magical "I don't know" - the "I don't know" encoding is
> > just a plain old bytes object.
>
> So, to boil down the ebytes idea, it is basically a request for a
> second string type that holds an octet stream plus an encoding name,
> rather than a Unicode character stream.

Do any other languages have any equivalent to this ebytes type?

If not, how do they deal with this issue?

[...]
> This is still smelling an awful lot like the 2.x str type to me

Yes. Virtually the only difference I can see is that it lets the user 
set a per-object default encoding to use when coercing strings to and 
from bytes.

If this is not the case, can somebody please explain what I'm missing?


-- 
Steven D'Aprano

From tjreedy at udel.edu  Tue Jun 22 01:48:46 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 19:48:46 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621172957.EB55C3A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>	<20100621164650.16A093A414B@sparrow.telecommunity.com>	<4C1F9833.2080905@voidspace.org.uk>
	<20100621172957.EB55C3A404D@sparrow.telecommunity.com>
Message-ID: <hvotov$hmm$1@dough.gmane.org>

On 6/21/2010 1:29 PM, P.J. Eby wrote:
> At 05:49 PM 6/21/2010 +0100, Michael Foord wrote:
>> Why is your proposed bstr wrapper not practical to implement outside
>> the core and use in your own libraries and frameworks?
>
> __contains__ doesn't have a converse operation, so you can't code a type
> that works around this (Python 3.1 shown):
>
>  >>> from os.path import join
>  >>> join(b'x','y')

>  >>> join('y',b'x')

I am really unclear what result you intend for such mixed pairs, for all 
possible mixed pairs, sensible or not. It would seem to me best to write 
your own pjoin function that did exactly what you want over the whole 
input domain.
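
As a rough sketch of what such a function could look like: the policy below
(the result type follows the first argument, and later arguments are
converted with the filesystem encoding) is only one possible choice and is
an assumption, not something proposed in the quoted message. The name
`pjoin` comes from the paragraph above; `os.fsencode` and `os.fsdecode`
require Python 3.2 or later.

```python
import os
import posixpath


def pjoin(first, *rest):
    """Hypothetical join: the result type follows the first argument."""
    if isinstance(first, bytes):
        # bytes first: coerce every later component to bytes.
        parts = [os.fsencode(p) for p in rest]
    else:
        # str first: coerce every later component to str.
        parts = [os.fsdecode(p) for p in rest]
    return posixpath.join(first, *parts)
```

With this policy, `pjoin(b'x', 'y')` gives a bytes path and
`pjoin('y', b'x')` gives a str path; a stricter variant could raise
TypeError on any mixed pair instead.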

-- 
Terry Jan Reedy


From nyamatongwe at gmail.com  Tue Jun 22 01:49:52 2010
From: nyamatongwe at gmail.com (Neil Hodgson)
Date: Tue, 22 Jun 2010 09:49:52 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <201006220927.31875.steve@pearwood.info>
References: <87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621201616.EADEF3A404D@sparrow.telecommunity.com>
	<AANLkTimWNuT_sCJNoWN_vcN3QtqDmn4u-ELViBq3ONHv@mail.gmail.com>
	<201006220927.31875.steve@pearwood.info>
Message-ID: <AANLkTime9bWFsvLTtYkUv3XI9JX04oFUK8LWXLpbnuTw@mail.gmail.com>

Steven D'Aprano:

> Do any other languages have any equivalent to this ebytes type?

   The String type in Ruby 1.9 is a byte string with an encoding attribute.

   Most online Ruby documentation is for 1.8 but the API can be examined here:
http://ruby-doc.org/ruby-1.9/index.html
   Here's something more explanatory:
http://blog.grayproductions.net/articles/ruby_19s_string

   My view is that this actually makes things much more complex by
making encoding combination an n*n problem (where n is the number of
encodings) rather than an n-sized problem when you have a single core
string type.

   Neil

From tjreedy at udel.edu  Tue Jun 22 01:55:59 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 19:55:59 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
	<20100621164650.16A093A414B@sparrow.telecommunity.com>
	<AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>
Message-ID: <hvou6g$ioa$1@dough.gmane.org>

On 6/21/2010 1:29 PM, Guido van Rossum wrote:

> Actually, the big problem with Python 2 is that if you mix str and
> unicode, things work or crash depending on whether any of the str
> objects involved contain non-ASCII bytes.
>
> If one API decides to upgrade to Unicode, the result, when passed to
> another API, may well cause a UnicodeError because not all arguments
> have had the same treatment.
>
>> Now, the APIs are neither safe nor aware -- if you pass bytes in, you get
>> unpredictable results back.
>
> This seems an overgeneralization of a particular bug. There are APIs
> that are strictly text-in, text-out. There are others that are
> bytes-in, bytes-out. Let's call all those *pure*. For some operations
> it makes sense that the API is *polymorphic*, with which I mean that
> text-in causes text-out, and bytes-in causes byte-out. All of these
> are fine.
>
> Perhaps there are more situations where a polymorphic API would be
> helpful. Such APIs are not always so easy to implement, because they
> have to be careful with literals or other constants (and even more so
> mutable state) used internally -- but it can be done, and there are
> plenty of examples in the stdlib.
>
> The real problem apparently lies in (what I believe is only a few
> rare) APIs that are text-or-bytes-in and always-text-out (or
> always-bytes-out). Let's call them *hybrid*. Clearly, mixing hybrid
> APIs in a stream of pure or polymorphic API calls is a problem,
> because they turn a pure or polymorphic overall operation into a
> hybrid one.
>
> There are also text-in, bytes-out or bytes-in, text-out APIs that are
> intended for encoding/decoding of course, but these are in a totally
> different class.
>
> Abstractly, it would be good if there were as few as possible hybrid
> APIs, many pure or polymorphic APIs (which it should be in a
> particular case is a pragmatic choice), and a limited number of
> encoding/decoding APIs, which should generally be invoked at the edges
> of the program (e.g., I/O).

Nice summary of part of the 'why' for Python3.
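
To illustrate the *polymorphic* category from the quoted summary, here is a
small sketch. The function `strip_comment` is hypothetical and not taken
from the stdlib; it shows the care with internal literals that the quoted
text mentions, by picking a bytes or str constant to match the argument
and refusing anything else.

```python
def strip_comment(line):
    """Polymorphic: str in gives str out, bytes in gives bytes out."""
    if isinstance(line, bytes):
        marker = b"#"   # internal literal must match the input type
    elif isinstance(line, str):
        marker = "#"
    else:
        raise TypeError("expected str or bytes, got %s" % type(line).__name__)
    # split() and rstrip() exist on both types, so the rest is shared.
    return line.split(marker, 1)[0].rstrip()
```

A *hybrid* version, in the quoted terminology, would be one that accepted
either type but always returned str.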

> I still believe that the instances of bytes silently
> succeeding *some* of the time refers to specific bugs in specific
> APIs, either intentional because of misguided compatibility desires,
> or accidental in the haste of trying to convert the entire stdlib to
> Python 3 in a finite time.

I think http://bugs.python.org/issue5468 reports one aspect of haste, 
missing encoding and errors parameters. But it has not gotten much 
attention.

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue Jun 22 02:46:03 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 20:46:03 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <hvp14a$pbv$1@dough.gmane.org>

On 6/21/2010 1:58 PM, Stephen J. Turnbull wrote:

> As for "Think Carefully About It Every Time", that is required only in
> Porting Programs That Mix Operation On Bytes With Operation On Str.

The 2.x anti-pattern

> If you write programs from scratch, however, the decode-process-encode
> paradigm quickly becomes second nature.

Except in this particular arena, it already should be to anyone reading 
this list. Decorate-sort-undecorate is another example of the same idea. 
Transform-compute-untransform is the basis of NP-complete theory. 
Frequency domain processing sandwiched between forward and reverse 
Fourier transforms is a third example. And so on.
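
Two of the patterns named above can be sketched side by side. Both
functions are hypothetical examples written for this note; the default
encoding and the particular processing step are assumptions.

```python
def normalize_header(raw, encoding="utf-8"):
    """Decode-process-encode: text operations happen only on str."""
    text = raw.decode(encoding)             # decode at the input edge
    text = " ".join(text.split()).title()   # process as text
    return text.encode(encoding)            # encode at the output edge


def sort_by_length(words):
    """Decorate-sort-undecorate: sort on a computed key, then strip it."""
    decorated = [(len(w), i, w) for i, w in enumerate(words)]  # decorate
    decorated.sort()                                           # sort
    return [w for _, _, w in decorated]                        # undecorate
```

In both cases the data is moved into a representation where the operation
is easy, worked on there, and moved back.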

-- 
Terry Jan Reedy


From jess.austin at gmail.com  Tue Jun 22 02:59:03 2010
From: jess.austin at gmail.com (Jess Austin)
Date: Mon, 21 Jun 2010 19:59:03 -0500
Subject: [Python-Dev] email package status in 3.X
Message-ID: <AANLkTimOA8SVdo90sXUtsUMwHLJervpLvKKF-AIniJ8l@mail.gmail.com>

On Mon, Jun 22, 2010 at 7:27:31 PM, Steven D'Aprano <steve at pearwood.info> wrote:
> On Tue, 22 Jun 2010 08:03:58 am Nick Coghlan wrote:
>> So, to boil down the ebytes idea, it is basically a request for a
>> second string type that holds an octet stream plus an encoding name,
>> rather than a Unicode character stream.
>
> Do any other languages have any equivalent to this ebytes type?

Ruby seems to do this:

http://yokolet.blogspot.com/2009/07/design-and-implementation-of-ruby-m17n.html

I don't use ruby myself, and I'm probably missing some subtle flaws,
but the exposition at that link makes sense to me.

cheers,
Jess

From alexander.belopolsky at gmail.com  Tue Jun 22 03:05:54 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 21 Jun 2010 21:05:54 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
	<AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
	<AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>
Message-ID: <AANLkTimwTtqA9wgaoZffU8QzHvbmyOAyu_aD3kxG6_jo@mail.gmail.com>

On Mon, Jun 21, 2010 at 6:39 PM, Paul Moore <p.f.moore at gmail.com> wrote:
..
> Yes, looks like it's a bug in the test. http://bugs.python.org/issue9055 raised.

I concur.  I've updated the issue with a proposed fix.  (The problem
is that proxy host names should have a '.' in them on OSX.)  I am
trying to decide whether the fix should be applied for all platforms
or conditionally for darwin.  Can someone test the fix on Windows?

From alexander.belopolsky at gmail.com  Tue Jun 22 03:08:19 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 21 Jun 2010 21:08:19 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTimwTtqA9wgaoZffU8QzHvbmyOAyu_aD3kxG6_jo@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
	<AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
	<AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>
	<AANLkTimwTtqA9wgaoZffU8QzHvbmyOAyu_aD3kxG6_jo@mail.gmail.com>
Message-ID: <AANLkTin1rCySAEMxq2hAEeF8xF2t3Ibj89SG1KW-2C_T@mail.gmail.com>

Oh, I thought that was about http://bugs.python.org/issue8455 .

On Mon, Jun 21, 2010 at 9:05 PM, Alexander Belopolsky
<alexander.belopolsky at gmail.com> wrote:
> On Mon, Jun 21, 2010 at 6:39 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> ..
>> Yes, looks like it's a bug in the test. http://bugs.python.org/issue9055 raised.
>
> I concur.  I've updated the issue with a proposed fix.  (The problem
> is that proxy host names should have a '.' in them on OSX.)  I am
> trying to decide whether the fix should be applied for all platforms
> or conditionally for darwin.  Can someone test the fix on Windows?
>

From janssen at parc.com  Tue Jun 22 03:26:59 2010
From: janssen at parc.com (Bill Janssen)
Date: Mon, 21 Jun 2010 18:26:59 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTin1rCySAEMxq2hAEeF8xF2t3Ibj89SG1KW-2C_T@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
	<AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
	<AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>
	<AANLkTimwTtqA9wgaoZffU8QzHvbmyOAyu_aD3kxG6_jo@mail.gmail.com>
	<AANLkTin1rCySAEMxq2hAEeF8xF2t3Ibj89SG1KW-2C_T@mail.gmail.com>
Message-ID: <1180.1277170019@parc.com>

Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> Oh, I thought that was about http://bugs.python.org/issue8455 .
> 
> On Mon, Jun 21, 2010 at 9:05 PM, Alexander Belopolsky
> <alexander.belopolsky at gmail.com> wrote:
> > On Mon, Jun 21, 2010 at 6:39 PM, Paul Moore <p.f.moore at gmail.com> wrote:
> > ..
> >> Yes, looks like it's a bug in the test. http://bugs.python.org/issue9055 raised.
> >
> > I concur.  I've updated the issue with a proposed fix.  (The problem
> > is that proxy host names should have a '.' in them on OSX.)  I am
> > trying to decide whether the fix should be applied for all platforms
> > or conditionally for darwin.  Can someone test the fix on Windows?

Ah, thanks for tracking that one down.  I'll bet it's the same problem
I'm seeing with proxy authentication with bad credentials unexpectedly
succeeding.

Though, isn't that behavior of urllib.proxy_bypass another bug?

Bill


From alexander.belopolsky at gmail.com  Tue Jun 22 03:38:43 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 21 Jun 2010 21:38:43 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FE4AF.80009@v.loewis.de>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk> <4C1FE4AF.80009@v.loewis.de>
Message-ID: <AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>

On Mon, Jun 21, 2010 at 6:16 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> The test_posix failure is a regression from 2.6 (but it only shows up on
>> some machines - it is caused by a fairly braindead implementation of a
>> couple of posix apis by Apple apparently).
>>
>> http://bugs.python.org/issue7900
>
> Ah, that one. I definitely think this should *not* block the release:

I agree that this is nowhere near being a release blocker, but I think
it would be nice to do something about it before the final release.

> a) there is no clear solution in sight. So if we wait for it resolved,
>    it could take months until we get a 2.7 release.

The ideal solution will have to wait until Apple gets its act together
and fixes the problem on their end.  I would say "months" is an overly
optimistic time estimate for that.  However, the issue is a regression
from prior versions.  In 2.5 getgroups would truncate the list to 16
groups, but won't crash.  More importantly the 16 groups returned
would be correct per-process groups and not something immune to
setgroup changes.

I proposed a very simple fix:

http://bugs.python.org/file16326/no-darwin-ext.diff

which simply minimally reverts the change that introduced the regression.

> b) it's only about getgroups - a fairly minor API.

Agree, but a failing regression test is an annoyance, particularly in
this case where the diagnostic from the test is very vague.  Short of
fixing the problem, we can skip the failing test on OSX if getgroups
raises an exception.
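
A rough sketch of that skip, assuming the failure surfaces as an OSError
from os.getgroups(). The test class and method names here are made up for
illustration and are not the ones in test_posix.

```python
import os
import unittest


class GetGroupsTest(unittest.TestCase):
    def test_getgroups(self):
        if not hasattr(os, "getgroups"):
            self.skipTest("os.getgroups is not available on this platform")
        try:
            groups = os.getgroups()
        except OSError as exc:
            # Assumed failure mode: report a skip with a clear reason
            # instead of a vague test failure.
            self.skipTest("os.getgroups() failed: %s" % exc)
        self.assertTrue(all(isinstance(g, int) for g in groups))
```

This only hides the symptom in the test suite; the regression itself would
still need the fix discussed above.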

> c) IIUC, it only occurs to users which are member of more than 16
>    groups - a fairly uncommon setup.
>

Unfortunately it is fairly common.  The default root account on OSX is
a member of 18 groups.  Given that many os tests require root
privileges, people will run these tests as root.

From stephen at xemacs.org  Tue Jun 22 03:41:02 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 10:41:02 +0900
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <4C1FDB65.4020503@v.loewis.de>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
Message-ID: <87tyov3kup.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. Löwis" writes:

 > Still, the question would be whether any of these failures can manage to 
 > block a release.

Exactly.  Personally, I would say that in a volunteer-maintained
project, "Platform X is supported" means that "There is a bug that
seems to affect only Platform X" is a candidate for release blocker,
or other standardized action to get things fixed (call for volunteers,
etc).  That's a matter for agreement among the volunteers, not an
objective definition.  I think statements of support for certain
platforms are useful to users, and that they cause very little
additional friction or misunderstanding.  (Users who think that
"support" implies "support contract" are usually capable of finding an
excuse to ignore *any* disclaimer of warranty; simply refusing to use
the word "support" won't save you from them!)

If a distinction needs to be made, we can say "Python *support* for a
platform does not imply that any particular issue will receive
concentrated attention from the core developers in any time frame.
When and how to address issues is up to the judgment of the
development community.  *Support contracts* are available from the
businesses listed on the Wiki under 'Python Consultancies' for those
who need a higher level of support."

From tjreedy at udel.edu  Tue Jun 22 03:46:27 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Mon, 21 Jun 2010 21:46:27 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>	<20100621145133.7F5333A404D@sparrow.telecommunity.com>	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
Message-ID: <hvp4ll$230$1@dough.gmane.org>

On 6/21/2010 2:46 PM, P.J. Eby wrote:

> This ignores the existence of use cases where what you have is text that
> can't be properly encoded in unicode.

I think it depends on what you mean by 'properly'. I will try to explain 
with English examples.

1. Unicode represents a finite set of characters and symbols and a few 
control or markup operators. The potential set is unbounded, so unicode 
includes a user area. I include use of that area in 'properly'. I kind 
of suspect that the statement above does not since any byte or short 
byte sequence that does not translate can instead use the user area.

2. Unicode disclaims direct representation of font and style 
information, leaving that to markup either in or out of the text stream. 
(It made an exception for Japanese narrow and wide ASCII chars, which I 
consider to essentially be duplicate font variations of the normal ASCII 
codes.) HTML uses both in-band and out-of-band (CSS) markup. Stripping 
markup information is a loss of information. If one wants it, one must 
keep it in one form or another.

I believe that some early editors like Wordstar used high-bit-set bytes 
for bold, underline, italic on and off. Assuming I have the example 
right, can Wordstar text be 'properly encoded in unicode'? If one 
insists that that mean replacement of each of the format markup chars 
with a single defined char in the Basic Multilingual Plane, then 'no'. 
If one allows replacement by <bold>, </bold>, and so on, then 'yes'.

3. Unicode disclaims direct representation of glyphic variants (though 
again, exceptions were made for Asian acceptance). For example, in 
English, mechanically printed 'a' and 'g' are different from manually 
printed 'a' and 'g'. Representing both by the same codepoint, in itself, 
loses information. One who wishes to preserve the distinction must 
instead use a font tag or perhaps a <handprinted> tag. Similarly, older 
English had a significantly different glyph for 's', which looks more 
like a modern 'f'.
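
Point 1 above (sending bytes that do not translate into the user area) can
be sketched as follows. This is an illustration under stated assumptions:
the handler name "pua-escape", the base codepoint U+E000, and both helper
functions are invented here, and the scheme is not a standard one.

```python
import codecs

PUA_BASE = 0xE000  # start of the BMP Private Use Area


def _to_pua(exc):
    """Error handler: map each undecodable byte to U+E000 + byte value."""
    if not isinstance(exc, UnicodeDecodeError):
        raise exc
    bad = exc.object[exc.start:exc.end]
    return "".join(chr(PUA_BASE + b) for b in bad), exc.end


codecs.register_error("pua-escape", _to_pua)


def decode_with_pua(data, encoding="ascii"):
    return data.decode(encoding, errors="pua-escape")


def encode_with_pua(text, encoding="ascii"):
    out = bytearray()
    for ch in text:
        cp = ord(ch)
        if PUA_BASE <= cp <= PUA_BASE + 0xFF:
            out.append(cp - PUA_BASE)       # restore the original byte
        else:
            out.extend(ch.encode(encoding))
    return bytes(out)
```

The round trip is lossless as long as the real text never uses the same
private codepoints for something else, which is the usual caveat with the
user area.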

If IBM's EBCDIC had codes for these glyph variants, IBM might have 
insisted that unicode also have such so char for char round-tripping 
would be possible. It does not and unicode does not. (Wordstar and other 
1980s editor publishers were mostly defunct or weak and not in a 
position to make such demands.)

If one wants to write on the history of glyph evolution, say of Latin 
chars, one must either number the variants 'e-0', 'e-1', etc, or resort 
to the user area. In either case, proprietary software would be needed 
to actually print the variations with other text.

> I know, it's a hard thing to wrap
> one's head around, since on the surface it sounds like unicode is the
> programmer's savior. Unfortunately, real-world text data exists which
> cannot be safely roundtripped to unicode,

I do not believe that. Digital information can always be recoded one way 
or another. As it is, the rules were bent for Japanese, in a way that 
they were not for English, to aid round-tripping of the major public 
encodings. I can, however, believe that there were private encodings for 
which round-tripping is more difficult. But there are also difficulties 
for old proprietary and even private English encodings.


-- 
Terry Jan Reedy


From stephen at xemacs.org  Tue Jun 22 04:06:36 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 11:06:36 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621160105.25ae602f@heresy>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
	<871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621160105.25ae602f@heresy>
Message-ID: <87sk4f3jo3.fsf@uwakimon.sk.tsukuba.ac.jp>

Barry Warsaw writes:

 > I'm still not sure ebytes solves the problem,

I don't see how it can.  If you have an encoding to stuff into ebytes,
you could just convert to Unicode and guarantee that all internal
string operations will succeed.  If you use ebytes instead, every
string operation has to be wrapped in "try ... except EBytesError", to
no gain that I can see.

If you don't have an encoding, then you just have bytes, which
strictly speaking shouldn't be operated on (in the sense of slicing,
dicing, or stir-frying) at all if you're in an environment where they
are a carrier for formatted information such as non-ASCII characters
or PNG images.
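
A minimal sketch of why slicing encoded bytes is hazardous, assuming UTF-8
data; the sample string is an illustrative assumption, not something from
the thread:

```python
# Illustrative only: the sample string is an assumption, not from the thread.
text = "na\u00efve"              # one non-ASCII character (i with diaeresis)
data = text.encode("utf-8")      # that character occupies two bytes

# Slicing the str works on characters and always yields valid text.
text_prefix = text[:3]

# Slicing the bytes works on octets and can cut a character in half.
byte_prefix = data[:3]           # ends in the lone lead byte 0xC3

try:
    byte_prefix.decode("utf-8")
    sliced_ok = True
except UnicodeDecodeError:
    sliced_ok = False            # the truncated sequence is not valid UTF-8
```

The same index gives a usable result on the str and an undecodable
fragment on the bytes.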

 > but it avoids one I'm most concerned about seeing proposed.  I
 > really really do not want to add encoding=blah arguments to
 > boatloads of function signatures.

Agreed.  But ebytes isn't a solution to that; it's a regression to one
of the hardest problems in Python 2.

OTOH, it seems to me that there's only one boatload to worry about.
That's the boatload containing protocol-less APIs, ie, Unix OS data
(names in the filesystem, content of environment variables).
Other platforms (Windows, Mac) are standardizing on protocols for
these things and enforcing them in the OS, and free Unices are going
to the convention that everything is non-normalized UTF-8.

What other boats are you worried about?


From alexander.belopolsky at gmail.com  Tue Jun 22 04:21:36 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 21 Jun 2010 22:21:36 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <1180.1277170019@parc.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
	<AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
	<AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>
	<AANLkTimwTtqA9wgaoZffU8QzHvbmyOAyu_aD3kxG6_jo@mail.gmail.com>
	<AANLkTin1rCySAEMxq2hAEeF8xF2t3Ibj89SG1KW-2C_T@mail.gmail.com>
	<1180.1277170019@parc.com>
Message-ID: <AANLkTilB0PJCqEhMfqAHnQrreYaO7c5gn0ZkL-9RzK5j@mail.gmail.com>

On Mon, Jun 21, 2010 at 9:26 PM, Bill Janssen <janssen at parc.com> wrote:
..
> Though, isn't that behavior of urllib.proxy_bypass another bug?

I don't know.  Ask Ronald.

From stephen at xemacs.org  Tue Jun 22 04:58:57 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 11:58:57 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621165611.GW5787@unaka.lan>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
Message-ID: <87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>

Toshio Kuratomi writes:

 > One comment here -- you can also have uri's that aren't decodable into their
 > true textual meaning using a single encoding.
 > 
 > Apache will happily serve out uris that have utf-8, shift-jis, and
 > euc-jp components inside of their path but the textual
 > representation that was intended will be garbled (or be represented
 > by escaped byte sequences).  For that matter, apache will serve
 > requests that have no true textual representation as it is working
 > on the byte level rather than the character level.

Sure.  I've never seen that combination, but I have seen Shift JIS and
KOI8-R in the same path.

But in that case, just using 'latin-1' as the encoding allows you to
use the (unicode) string operations internally, and then spew your
mess out into the world for someone else to clean up, just as using
bytes would.
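
A small sketch of the 'latin-1' trick described above. The sample path,
which mixes Shift JIS and KOI8-R segments, is an assumption made up for
illustration:

```python
# Bytes mixing two encodings in one path; the sample path is an assumption.
raw = (b"/" + "\u65e5\u672c".encode("shift_jis")
       + b"/" + "\u0444\u0430\u0439\u043b".encode("koi8_r"))

# latin-1 maps each byte 0x00-0xFF to the code point of the same value,
# so decoding never fails, whatever the bytes were meant to represent.
as_text = raw.decode("latin-1")

# Ordinary str operations now work (on mojibake, not on the intended text).
segments = as_text.split("/")

# Encoding back with latin-1 restores the original octets exactly.
round_tripped = "/".join(segments).encode("latin-1")
```

The round trip is lossless, but the str in the middle is not meaningful
text and should not be shown to users or mixed with real text.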

 > So a complete solution really should allow the programmer to pass
 > in uris as bytes when the programmer knows that they need it.

Other than passing bytes into a constructor, I would argue that if a
complete solution requires, eg, an interface that allows
urljoin(base,subdir) where the types of base and subdir are not
required to match, then it doesn't belong in the stdlib.  For stdlib
usage, that's premature optimization IMO.

The RFC says that URIs are text, and therefore they can (and IMO
should) be operated on as text in the stdlib.  It's not just a matter
of manipulating the URIs themselves, where working directly on bytes
will work just as well and with the same string operations (as
long as everything is bytes).  It's also a question of API complexity
(eg, Barry's bugaboo of proliferation of encoding= parameters) and of
debugging (if URIs are internally str, then they will display sanely
in tracebacks and the interpreter).

The cases where URIs can't be sanely treated as text are garbage
input, and the stdlib should not try to provide a solution.  Just
passing in bytes and getting out bytes is GIGO.  Trying to do "some"
error-checking is going to be insufficient much of the time and overly
strict most of the rest of the time.  The programmer in the trenches
is going to need to decide what to allow and what not; I don't think
there are general answers because we know that allowing random URLs on
the web leads to various kinds of problems.  Some sites will need to
address some of them.

Note also that the "complete solution" argument cuts both ways.  Eg, a
"complete" solution should implement UTS 39 "confusables detection"[1]
and IDNA[2].  Good luck doing that with bytes!

If you *need* bytes (rather than simply trying to avoid conversion
overhead), you're in a hazmat handling situation.  Passing bytes in to
stdlib APIs here is the equivalent of carrying around kilograms of
fissionables in an open bucket.  While the Tokaimura comparison is
hyperbole, it can't be denied that use of bytes here shortcuts a lot
of processing strongly suggested by the RFCs, and prevents use of
various programming conveniences (such as reasonable display of URI
values in debugging).  Does the efficiency really justify including
that in the stdlib?  I dunno, I'm not a web programmer in the
trenches.  But I take my cue from MvL and MAL who don't seem real
enthusiastic about this.

And as Martin says, there is as yet no evidence offered that the
overhead of conversion is a general problem.


Footnotes: 
[1]  http://www.unicode.org/reports/tr39/

[2]  http://www.rfc-editor.org/rfc/rfc3490.txt

From stephen at xemacs.org  Tue Jun 22 06:15:19 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 13:15:19 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTinQ_d_vaHBw5IKUYY9qgjqOfFy4XCzC0DYztr9n@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTinQ_d_vaHBw5IKUYY9qgjqOfFy4XCzC0DYztr9n@mail.gmail.com>
Message-ID: <87ocf33dpk.fsf@uwakimon.sk.tsukuba.ac.jp>

Robert Collins writes:

 > Perhaps you mean 3986 ? :)

Thank you for the correction.

 > >    A URI is an identifier consisting of a sequence of characters
 > >    matching the syntax rule named <URI> in Section 3.
 > >
 > > (where the phrase "sequence of characters" appears in all ancestors I
 > > found back to RFC 1738), and
 > 
 > Sure, ok, let me unpack what I meant just a little. An abstract URI is
 > neither unicode nor bytes per se - see section 1.2.1 " A URI is a
 > sequence of characters from a very limited set: the letters of the
 > basic Latin alphabet, digits, and a few special characters. "

My position is that this describes the network protocol, not the
abstract URI.  It in no way suggests that uri-encoded forms should be
handled internally.  And the RFC explicitly says this is text, and
therefore sanctions the user- and programmer-friendly practice of
doing internal processing as text.

Note that in a hypothetical bytes-oriented API

    base = convert_uri_to_wire_format('http://www.example.org/')
    formuri = uri_join(base,b'home/steve/public_html')

the bytes literal b'home/steve/public_html' clearly is intended as
readable text.  This is mixing types in the programmer's mind, even
though base is internally in bytes format and the relative URI is also
in bytes format.  This is un-Pythonic IMO.
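
For comparison, a sketch using the real urllib.parse.urljoin rather than
the hypothetical uri_join above. It assumes a current Python 3, where
urljoin accepts either all-str or all-ASCII-bytes arguments and refuses
to mix them:

```python
from urllib.parse import urljoin

# All-text call: both arguments are str, and so is the result.
text_uri = urljoin("http://www.example.org/", "home/steve/public_html")

# All-bytes call: both arguments are (ASCII) bytes, and so is the result.
bytes_uri = urljoin(b"http://www.example.org/", b"home/steve/public_html")

# Mixed call: rejected with TypeError rather than guessed at.
try:
    urljoin("http://www.example.org/", b"home/steve/public_html")
    mixed_allowed = True
except TypeError:
    mixed_allowed = False
```

Requiring the argument types to match keeps the encoding decision out of
the join itself.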

 > URI interpretation is fairly strictly separated between producers and
 > consumers. A consumer can manipulate a url with other url fragments -
 > e.g. doing urljoin. But it needs to keep the url as a url and not try
 > to decode it to a unicode representation.

Unfortunately, outside of Kansas and Canberra, it don't work that
way.  How do you propose to uri_join base as above and
'/home/?????/public_html'?  Encoding and/or decoding must be done
somewhere, and it would be damn unfriendly to make the browser user do
it!

In the bytes-oriented API, the programmer must be continually making
decisions about whether and how to handle non-ASCII components from
"outside" (or, more likely, cursing the existence of the damned
foreigners, and then ignoring the possibility ... let them eat
UnicodeException!)

 > As an example, if I give the uri "http://server/%c3%83", rendering
 > that as http://server/Ã is able to lead to transcription errors and
 > reinterpretation problems unless you know - out of band - that the
 > server is using utf8 to encode. Conversely if someone enters in
 > http://server/Ã in their browser window, choosing utf8 or their local
 > encoding is quite arbitrary and able to not match how the server would
 > represent that resource.

Sure.  Using bytes doesn't solve either problem.  It just allows you
to wash your hands of it and pass it on to someone else, who probably
has even less information than you do.

Eg, in the case of passing the uri "http://server/%c3%83" to someone
else without telling them the encoding means that effectively they're
limited to ASCII if they want to append meaningful relative paths
without guessing the encoding.

In the case of the user entering "http://server/Ã", you have to do
*something* to produce bytes eventually.  When was the last time you
typed "%c3%83" at the end of a URL in a browser address field?
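
A short sketch of that "something", using urllib.parse.quote and unquote
from a current Python 3. It shows that the octets produced, and the text
recovered, depend on the encoding chosen at this step:

```python
from urllib.parse import quote, unquote

ch = "\u00c3"  # LATIN CAPITAL LETTER A WITH TILDE

# What gets sent depends entirely on the encoding chosen here.
as_utf8 = quote(ch)                          # UTF-8 is quote()'s default
as_latin1 = quote(ch, encoding="latin-1")

# The reverse direction needs the same out-of-band knowledge.
read_as_utf8 = unquote("%c3%83")             # UTF-8 is unquote()'s default
read_as_latin1 = unquote("%c3%83", encoding="latin-1")
```

Under UTF-8 the escape "%c3%83" is one character; under latin-1 it is
two, the second being an invisible control character.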

 > >    2.  Characters
 > >
 > >    The URI syntax provides a method of encoding data, presumably for
 > >    the sake of identifying a resource, as a sequence of characters.
 > >    The URI characters are, in turn, frequently encoded as octets for
 > >    transport or presentation.  This specification does not mandate any
 > >    particular character encoding for mapping between URI characters
 > >    and the octets used to store or transmit those characters.  When a
 > >    URI appears in a protocol element, the character encoding is
 > >    defined by that protocol; without such a definition, a URI is
 > >    assumed to be in the same character encoding as the surrounding
 > >    text.
 > 
 > That's true, but it's been taken out of context; the set of characters
 > permitted in a URL is a strict subset of characters found in ASCII;

No.  Again, you're confounding "the URL" with its network format.
There's no question that the network format is in bytes, and before
putting the URI into a wire protocol, you need to encode non-URI
characters.  However, the abstract URI is text, and may not even be
represented by octets or Unicode at all (eg, represented by carbon
residue on recycled wood pulp).

 > See also the section on comparing URL's - Unicode isn't at all relevant.

Not to the RFC, which talks about *characters* and gives examples that
imply transcoding (eg, between EBCDIC and UTF-16), see the section you
cite.  However, Unicode is the canonical representation of text inside
Python, and therefore TOOWTDI for URL comparison in Python.

Thank you for that killer argument for my position; I hadn't thought
of it.

 > I wish it would. The problem is not in Python here though - and
 > casually handwaving will exacerbate it, not fix it. 

Using bytes "because we just don't know" is exactly casual handwaving.
Well, maybe not casual; I'm aware that many programmers are driven to
it by the recognition that only the extremes (all bytes vs. all text)
make sense, and they choose bytes for efficiency reasons.

I believe that focus on efficiency is un-Pythonic; that in Python 3
text should be chosen (in the stdlib) because it makes writing
programs more fun (you can use literal notation for non-ASCII string
constants, for example) and debuggable.

Sure, in some cases you'll need to punt to 'latin-1' (ie, 'binary') or
perhaps PEP 383 lone surrogates (this would require special handling
to get reasonably friendly presentation to users and debuggers, I
suppose), but for the many cases where you know that everything is in
the same encoding life is a lot better.  And of course I have no
objection to an additional API for efficiency for those who want it,
and maybe that even belongs in the stdlib.  But IMO the TOOWTDI should
use text (ie, Python 3 str = Unicode) by default.
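
A minimal sketch of the PEP 383 fallback mentioned above, using the
'surrogateescape' error handler. The sample path is an assumption for
illustration:

```python
raw = b"/home/caf\xe9/public_html"   # latin-1 bytes, not valid UTF-8

# Strict UTF-8 decoding refuses the stray byte ...
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# ... while 'surrogateescape' carries it through as a lone surrogate.
text = raw.decode("utf-8", "surrogateescape")

# The ASCII parts are ordinary text and can be manipulated as such.
parts = text.split("/")

# Encoding with the same handler restores the original bytes.
restored = text.encode("utf-8", "surrogateescape")
```

The lone surrogate is what needs the special handling for display: it
cannot be printed or encoded without the same error handler.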

 > Modelling URL's as string like things is great from a convenience
 > perspective, but, like file paths, they are much more complex
 > difficult.

No.  Like file paths, it is the key to any real solution to the
problem.  Users, both server admins, URN specifiers, and browsers,
think about the URI as text and expect inputting text to work.  As
does the RFC.  Machines, on the other hand, think of both as bytes (at
least in the general Unix world).  It is the programmer's job to do
the best she can to identify the correct encoding to bridge the
mismatch.  She can abdicate that job, of course, but if she chooses
*not* to abdicate, (1) treating the URI as text encourages her to
confront the issue early, and (2) ensures that to the extent possible
the URI will maintain its quality of intelligible text.

With bytes, your only sane choice is to abdicate.

N.B.  STD 66 refrains from redefining HTTP URLs to be UTF-8 because
*it would not work*.  Practically, Nippon Tel & Tel will continue to
use Shift JIS URIs for cellphone-oriented sites because its handset
browsers only understand Shift JIS (or some such nonsense).

 > If Unicode was relevant to HTTP,

Again, Unicode is relevant not because of the wire protocols, but
because of Python's and because of the intent of the RFCs.

 > I'd agree, but its not; we should put fragile heuristics at the
 > outer layer of the API and work as robustly and mechanically as
 > possible at the core. Where we need to guess, we need worker
 > functions that won't guess at all - for the sanity of folk writing
 > servers and protocol implementations.

A worker function that doesn't guess must error in the absence of
out-of-band information about the encoding.  This is true whether you
represent URIs internally as bytes or as text.  Refusing to error
constitutes a guess, because in a bytes-internal system, eventually
text from outside will find its way into the system, and must be
encoded to bytes, and in the case of a text-internal system, obviously
bytes from outside are coming in and must be decoded to text.
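
A sketch of a worker function that refuses to guess. The names
decode_path and UnknownEncodingError are hypothetical, and the ASCII
shortcut assumes an ASCII-compatible encoding (a stateful 7-bit encoding
such as ISO-2022-JP would defeat it):

```python
from typing import Optional
from urllib.parse import unquote_to_bytes


class UnknownEncodingError(ValueError):
    """Raised when no encoding was supplied and guessing is not allowed."""


def decode_path(wire_path: bytes, encoding: Optional[str] = None) -> str:
    """Turn a percent-encoded path into text without ever guessing."""
    octets = unquote_to_bytes(wire_path)     # undo %XX escapes
    if octets.isascii():
        # Assumption: the unknown encoding is ASCII-compatible.
        return octets.decode("ascii")
    if encoding is None:
        raise UnknownEncodingError("non-ASCII path and no encoding given")
    return octets.decode(encoding)           # strict: wrong labels fail loudly
```

The function errors exactly when out-of-band information is missing,
which is the behaviour argued for above.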


From stephen at xemacs.org  Tue Jun 22 07:17:10 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 14:17:10 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621191432.710993A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621114307.48735698@heresy>
	<871vc045sl.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621191432.710993A404D@sparrow.telecommunity.com>
Message-ID: <87mxun3auh.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:

 > In Kagoshima, you'd pass in an ebytes with your encoding to a
 > stdlib API, and *get back an ebytes with the right encoding*,
 > rather than an (incorrect and useless) unicode object which has
 > lost data you need.

How does the stdlib do that?  Unless it guesses which encoding for
Japanese is being used?  And even if this ebytes uses Shift JIS, what
makes that the "right" encoding for anything?

On the other hand, I know when *I* need some encoding, and when I
figure it out I will store it in an appropriate place in my program.
The problem is that for some programs it is not unlikely that I will
see all of Shift JIS, EUC-JP, ISO-2022-JP, UTF-8, and UTF-16, and on a
very bad day, RFC 2047, GB 2312, and Big5, too, used to encode
Japanese.  It's not totally unlikely for a browser to send URLs to a
server expecting UTF-8 to recover a message/rfc822 object containing
ISO-2022-JP in the mail header and EUC-JP in the body.

So I need to know which encoding was used by the server that sent the
reply, but the ebytes can't tell me that if it fishes an URL in EUC-JP
out of the message body.  I need to convert that URL to UTF-8, or most
servers will 404.
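
A sketch of that conversion, assuming the source encoding has already
been determined from context. The helper name transcode_url_path and the
sample path are assumptions for illustration:

```python
from urllib.parse import quote, unquote_to_bytes


def transcode_url_path(path: str, source_encoding: str) -> str:
    """Re-encode a percent-encoded path from source_encoding to UTF-8."""
    octets = unquote_to_bytes(path)          # undo %XX escapes, giving raw bytes
    text = octets.decode(source_encoding)    # strict: fails if the label is wrong
    return quote(text)                       # quote() percent-encodes as UTF-8


# A path as it might appear in an EUC-JP message body (sample text is assumed).
euc_path = quote("/\u65e5\u672c/index.html".encode("euc_jp"))
utf8_path = transcode_url_path(euc_path, "euc_jp")
```

The hard part, knowing that "euc_jp" is the right label, stays outside
the function; nothing stored alongside the bytes supplies it.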

 > But this is not the case at all, for use cases where "no, really, you 
 > *have to* work with bytes-encoded text streams".  The mere release of 
 > Python 3.x will not cause all the world's applications, libraries, 
 > and protocols to suddenly work with unicode, where they did not before.

Sure.  That's what .encode() and .decode() are for.  The problem is
what to do when you don't know what to put in the parentheses, and I
can't think of a use case offhand where ebytes(stuff,'garbage')
does better than PEP 383-enabled str for:

 > Being explicit about the encoding of the bytes you're flinging
 > around is actually an *increase* in specificity, explicitness,
 > robustness, and error-checking ability over the status quo for
 > either 2.x *or* 3.x...  *and* it improves these qualities for
 > essentially *all* string-handling code, without requiring that code
 > to be rewritten to do so.

A well-spoken piece.  But, you see, most of those encodings are *only*
interesting so that you can transcode characters to the encoding of
interest.  What's the e.o.i.?  That is easily found in the context or
has an obvious default, if you're lucky, or otherwise a hard problem
that ebytes does nothing to help solve as far as I can see.

Cf. Robert Collins' post
<AANLkTinQ_d_vaHBw5IKUYY9qgjqOfFy4XCzC0DYztr9n@mail.gmail.com>, where
he makes it quite explicit that a bytes interface is all about punting
in the face of missing encoding information.

 > >and (2) you really want this under control of higher level objects
 > >that have access to some knowledge of the environment, rather than
 > >the lowest level.
 > 
 > This proposal actually has such a higher-level object: an 
 > ebytes.

I don't see how that can be true.  An ebytes is a very low-level
object that has no idea whether its encoding is interesting (eg, the
one that an RFC or a server specifies), or a technical detail of use
only until the ebytes is decoded, then can be thrown away.

I just don't see, in the case where there is a real encoding in the
ebytes, what harm is done by decoding the ebytes to str.  If context
indicates that the encoding is an interesting one (eg, it should be
the default for encoding on output), then you want to save that in an
appropriate place that preserves not just the encoding itself, but the
context that gives it its importance.


From glyph at twistedmatrix.com  Tue Jun 22 07:22:22 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 22 Jun 2010 01:22:22 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621181750.267933A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>
	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>
	<20100621164650.16A093A414B@sparrow.telecommunity.com>
	<AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>
	<20100621181750.267933A404D@sparrow.telecommunity.com>
Message-ID: <C56762CD-C47C-4153-BAED-32B6786BDE5C@twistedmatrix.com>

On Jun 21, 2010, at 2:17 PM, P.J. Eby wrote:

> One issue I remember from my "enterprise" days is some of the Asian-language developers at NTT/Verio explaining to me that unicode doesn't actually solve certain issues -- that there are use cases where you really *do* need "bytes plus encoding" in order to properly express something.

The thing that I have heard in passing from a couple of folks with experience in this area is that some older software in Asia would present characters differently if they were originally encoded in a "Japanese" encoding versus a "Chinese" encoding, even though they were really "the same" characters.

I do know that Han Unification is a giant political mess (<http://en.wikipedia.org/wiki/Han_unification> makes for some interesting reading), but my understanding is that it has handled enough of the cases by now that one can write software to display Asian languages and it will basically work with a modern version of Unicode.  (And of course, there's always the private use area, as Stephen Turnbull pointed out.)

Regardless, this is another example where keeping around a string isn't really enough.  If you need to display a Japanese character in a distinct way because you are operating in the Japanese *script*, you need a tag surrounding your data that is a hint to its presentation.  The fact that these presentation hints were sometimes determined by their encoding is an unfortunate historical accident.

From glyph at twistedmatrix.com  Tue Jun 22 07:31:16 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 22 Jun 2010 01:31:16 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>

On Jun 21, 2010, at 10:58 PM, Stephen J. Turnbull wrote:

> The RFC says that URIs are text, and therefore they can (and IMO
> should) be operated on as text in the stdlib.


No, *blue* is the best color for a shed.

Oops, wait, let me try that again.

While I broadly agree with this statement, it is really an oversimplification.  A URI is a structured object, with many different parts, which are transformed from bytes to ASCII (or something latin1-ish, which is really just bytes with a nice face on them) to real, honest-to-goodness text via the IRI specification: <http://tools.ietf.org/html/rfc3987>.

> Note also that the "complete solution" argument cuts both ways.  Eg, a
> "complete" solution should implement UTS 39 "confusables detection"[1]
> and IDNA[2].  Good luck doing that with bytes!

And good luck doing that with just characters, too.  You need a parsed representation of the URI that you can encode different parts of in different ways.  (My understanding is that you should only really implement confusables detection in the netloc... while that may be a bogus example, you're certainly only supposed to do IDNA in the netloc!)
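
A simplified sketch of encoding different parts in different ways, using
urlsplit and Python's built-in 'idna' codec (which implements RFC 3490).
The function name iri_to_uri is hypothetical, and userinfo and fragment
handling are deliberately left out:

```python
from urllib.parse import quote, urlsplit, urlunsplit


def iri_to_uri(iri: str) -> str:
    """Encode each component of an IRI by its own rule (simplified sketch)."""
    parts = urlsplit(iri)
    # Host: IDNA applies here and nowhere else.
    host = parts.hostname.encode("idna").decode("ascii")
    if parts.port is not None:
        host = f"{host}:{parts.port}"
    # Path and query: UTF-8 octets, percent-encoded; existing escapes kept.
    path = quote(parts.path, safe="/%")
    query = quote(parts.query, safe="=&%")
    return urlunsplit((parts.scheme, host, path, query, parts.fragment))


uri = iri_to_uri("http://b\u00fccher.example/caf\u00e9?q=\u00fc")
```

The host and the path start from the same kind of text but end up in
unrelated encodings, which is why a flat string is not enough.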

You can just call urlsplit() all over the place to emulate this, but this does not give you the ability to go back to the original bytes, and thereby preserve things like brokenly-encoded segments, which seems to be what a lot of this hand-wringing is about.

To put it another way, there is no possible information-preserving string or bytes type that will make everyone happy as a result from urljoin().  The only return-type that gives you *everything* is "URI".

> just using 'latin-1' as the encoding allows you to
> use the (unicode) string operations internally, and then spew your
> mess out into the world for someone else to clean up, just as using
> bytes would.

This is the limitation that everyone seems to keep dancing around.  If you are using the stdlib, with functions that operate on sequences like 'str' or 'bytes', you need to choose from one of three options:

  1. "decode" everything to latin1 (although I prefer to call it "charmap" when used in this way) so that you can have some mojibake that will fool a function that needs a unicode object, but not lose any information about your input so that it can be transformed back into exact bytes (and be very careful to never pass it somewhere that it will interact with real text!),
  2. actually decode things to an appropriate encoding to be displayed to the user and manipulated with proper text-manipulation tools, and throw away information about the bytes,
  3. keep both the bytes and the characters together (perhaps in a data structure) so that you can both display the data and encode it in situationally-appropriate ways.
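
The third option can be sketched as a small data structure. The class
name Segment and its fields are hypothetical, chosen only to illustrate
keeping the octets and the display text side by side:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    """One path segment, kept as both wire bytes and display text."""
    raw: bytes       # exact octets received; never re-derived from text
    text: str        # best-effort decoding for humans

    @classmethod
    def from_bytes(cls, raw: bytes, encoding: str) -> "Segment":
        # errors="replace" may lose information in `text`; `raw` keeps it.
        return cls(raw, raw.decode(encoding, errors="replace"))


good = Segment.from_bytes("caf\u00e9".encode("utf-8"), "utf-8")
broken = Segment.from_bytes(b"caf\xe9", "utf-8")   # latin-1 bytes mislabelled
```

Even for the brokenly-encoded segment, the original bytes remain
available for putting back on the wire.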

The stdlib as it is today is not going to handle the 3rd case for anyone.  I think that's fine; it is not the stdlib's job to solve everyone's problems.  I've been happy with it providing correctly-functioning pieces that can be used to build more elaborate solutions.  This is what I meant when I said I agree with Stephen's first point: the stdlib *should* just keep operating entirely on strings, because URIs are defined, by the spec, to be sequences of ASCII characters.  But that's not the whole story.

PJE's "bstr" and "ebytes" proposals set my teeth on edge.  I can totally understand the motivation for them, but I think it would be a big step backwards for python 3 to succumb to that temptation, even in the form of a third-party library.  It is really trying to cram more information into a pile of bytes than truly exists there.  (Also, if we're going to have encodings attached to bytes objects, I would very much like to add "JPEG" and "FLAC" to the list of possibilities.)

The real tension there is that WSGI is desperately trying to avoid defining any data structures (i.e. classes), while still trying to work with structured data.  A URI class with a 'child' method could handily solve this problem.  You could happily call IRI(...).join(some bytes).join(some text) and then just say "give me some bytes, it's time to put this on the network", or "give me some characters, I have to show something to the user", or even "give me some characters appropriate for an 'href=' target in some HTML I'm generating" - although that last one could be left to the HTML generator, provided it could get enough information from the URI/IRI object's various parts itself.
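
A rough sketch of what such a class might look like. Everything here
(the class name URI, the methods child, to_wire and to_display, and the
choice to store path segments as raw bytes) is an assumption for
illustration, not an existing stdlib or WSGI API:

```python
from urllib.parse import quote


class URI:
    """Hypothetical structured URI; stores the path as raw byte segments."""

    def __init__(self, base: str, segments=()):
        self.base = base.rstrip("/")           # ASCII scheme and authority
        self.segments = tuple(segments)        # each element is bytes

    def child(self, name, encoding="utf-8"):
        # Text is encoded once, here; bytes are taken exactly as given.
        raw = name.encode(encoding) if isinstance(name, str) else bytes(name)
        return URI(self.base, self.segments + (raw,))

    def to_wire(self) -> bytes:
        """Percent-encoded ASCII bytes, ready for the network."""
        path = "/".join(quote(seg, safe="") for seg in self.segments)
        return f"{self.base}/{path}".encode("ascii")

    def to_display(self, encoding="utf-8") -> str:
        """Readable text; undecodable octets are shown as U+FFFD."""
        path = "/".join(seg.decode(encoding, "replace") for seg in self.segments)
        return f"{self.base}/{path}"


uri = URI("http://www.example.org/").child(b"home").child("caf\u00e9")
```

Here uri.to_wire() gives bytes for the network and uri.to_display()
gives text for the user, with the caller never handling a bare string in
between.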

I don't mean to pick on WSGI, either.  This is a common pain-point for porting software to 3.x - you had a string, it kinda worked most of the time before, but now you need to keep track of text too and the functions which seemed to work on bytes no longer do.

From stephen at xemacs.org  Tue Jun 22 07:28:57 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 14:28:57 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>
Message-ID: <87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>

Michael Urman writes:

 > It is somewhat troublesome that there doesn't appear to be an obvious
 > built-in idempotent-when-possible function that gives back the
 > provided bytes/str,

If you want something idempotent, it's already the case that
bytes(b'abc') => b'abc'.  What might be desirable is to make
bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
(or maybe ISO 8859/1).

Unfortunately, str(b'abc') already does work, but

steve@uwakimon ~ $ python3.1
Python 3.1.2 (release31-maint, May 12 2010, 20:15:06) 
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str(b'abc')
"b'abc'"
>>> 

Oops.  You can see why that probably "should" be the case.
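
A minimal sketch of the "idempotent when possible" conversion discussed above, assuming the pure-ASCII restriction is wanted. The name as_ascii_bytes is hypothetical; nothing like it exists in the stdlib.

```python
def as_ascii_bytes(value):
    """Return bytes unchanged; encode str only when it is pure ASCII.

    Hypothetical helper for illustration, not a stdlib function.
    """
    if isinstance(value, bytes):
        # Already bytes: hand it back untouched, so the call is idempotent.
        return value
    if isinstance(value, str):
        # str.encode("ascii") raises UnicodeEncodeError for non-ASCII text.
        return value.encode("ascii")
    raise TypeError("expected bytes or str, got %s" % type(value).__name__)


print(as_ascii_bytes(b"abc"))
print(as_ascii_bytes("abc"))
print(as_ascii_bytes(as_ascii_bytes("abc")))
```

Going the other way, str(b'abc', 'ascii') or b'abc'.decode('ascii') gives 'abc'; it is only the one-argument str() call that returns the repr.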

From a.badger at gmail.com  Tue Jun 22 07:50:40 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Tue, 22 Jun 2010 01:50:40 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100622055040.GE5787@unaka.lan>

On Tue, Jun 22, 2010 at 11:58:57AM +0900, Stephen J. Turnbull wrote:
> Toshio Kuratomi writes:
> 
>  > One comment here -- you can also have uri's that aren't decodable into their
>  > true textual meaning using a single encoding.
>  > 
>  > Apache will happily serve out uris that have utf-8, shift-jis, and
>  > euc-jp components inside of their path but the textual
>  > representation that was intended will be garbled (or be represented
>  > by escaped byte sequences).  For that matter, apache will serve
>  > requests that have no true textual representation as it is working
>  > on the byte level rather than the character level.
> 
> Sure.  I've never seen that combination, but I have seen Shift JIS and
> KOI8-R in the same path.
> 
> But in that case, just using 'latin-1' as the encoding allows you to
> use the (unicode) string operations internally, and then spew your
> mess out into the world for someone else to clean up, just as using
> bytes would.
> 
This is true.  I'm giving this as a real-world counterexample to the
assertion that URIs are "text".  In fact, I think you're confusing things
a little by asserting that the RFC says that URIs are text.  I'll address
that two sections down.

>  > So a complete solution really should allow the programmer to pass
>  > in uris as bytes when the programmer knows that they need it.
> 
> Other than passing bytes into a constructor, I would argue if a
> complete solution requires, eg, an interface that allows
> urljoin(base,subdir) where the types of base and subdir are not
> required to match, then it doesn't belong in the stdlib.  For stdlib
> usage, that's premature optimization IMO.
> 
I'll definitely buy that.  Would urljoin(b_base, b_subdir) => bytes and
urljoin(u_base, u_subdir) => unicode be acceptable though?  (I think, given
other options, I'd rather see two separate functions, though.  It seems more
discoverable and less prone to taking bad input some of the time to have two
functions that clearly only take one type of data apiece.)
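
A rough sketch of the "two separate functions" idea, assuming a current Python 3 where urllib.parse.urljoin accepts either all-str or all-bytes arguments.  The wrapper names urljoin_bytes and urljoin_text are made up for illustration and are not part of urllib.

```python
from urllib.parse import urljoin


def urljoin_bytes(base, url):
    """Join two URIs given as bytes; refuse anything else (hypothetical)."""
    if not isinstance(base, bytes) or not isinstance(url, bytes):
        raise TypeError("urljoin_bytes() takes bytes arguments only")
    return urljoin(base, url)


def urljoin_text(base, url):
    """Join two URIs given as str; refuse anything else (hypothetical)."""
    if not isinstance(base, str) or not isinstance(url, str):
        raise TypeError("urljoin_text() takes str arguments only")
    return urljoin(base, url)


print(urljoin_bytes(b"http://host/a/", b"b"))
print(urljoin_text("http://host/a/", "b"))
```

Each function takes exactly one type, so bad input fails at the call site every time rather than only for some combinations of arguments.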

> The RFC says that URIs are text, and therefore they can (and IMO
> should) be operated on as text in the stdlib.

If I'm reading the RFC correctly, you're actually operating on two different
levels here.  Here's the section 2 that you quoted earlier, now in its
entirety::

2.  Characters

   The URI syntax provides a method of encoding data, presumably for the
   sake of identifying a resource, as a sequence of characters.  The URI
   characters are, in turn, frequently encoded as octets for transport or
   presentation.  This specification does not mandate any particular
   character encoding for mapping between URI characters and the octets used
   to store or transmit those characters.  When a URI appears in a protocol
   element, the character encoding is defined by that protocol; without such
   a definition, a URI is assumed to be in the same character encoding as
   the surrounding text.

   The ABNF notation defines its terminal values to be non-negative integers
   (codepoints) based on the US-ASCII coded character set [ASCII].  Because
   a URI is a sequence of characters, we must invert that relation in order
   to understand the URI syntax.  Therefore, the integer values used by the
   ABNF must be mapped back to their corresponding characters via US-ASCII
   in order to complete the syntax rules.

   A URI is composed from a limited set of characters consisting of digits,
   letters, and a few graphic symbols.  A reserved subset of those
   characters may be used to delimit syntax components within a URI while
   the remaining characters, including both the unreserved set and those
   reserved characters not acting as delimiters, define each component's
   identifying data.

So here's some data that matches those terms up to actual steps in the
process::

  # We start off with some arbitrary data that defines a resource.  This is
  # not necessarily text.  It's the data from the first sentence:
  data = b"\xff\xf0\xef\xe0"

  # We encode that into text and combine it with the scheme and host to form
  # a complete uri.  This is the "URI characters" mentioned in section #2.
  # It's also the "sequence of characters" mentioned in 1.1, as it is not
  # until this point that we actually have a URI.
  uri = b"http://host/" + percentencoded(data)
  # 
  # Note1: percentencoded() needs to take any bytes or characters outside of
  # the characters listed in section 2.3 (ALPHA / DIGIT / "-" / "." / "_"
  # / "~") and percent encode them.  The URI can only consist of characters
  # from this set and the reserved character set (2.2).
  #
  # Note2: in this simplistic example, we're only dealing with one piece of
  # data.  With multiple pieces, we'd need to combine them with separators,
  # for instance like this:
  # uri = b'http://host/' + percentencoded(data1) + b'/'
  # + percentencoded(data2)
  #
  # Note3: at this point, the uri could be stored as unicode or bytes in
  # python3.  It doesn't matter.  It will be a subset of ASCII in either
  # case.

  # Then we take this and encode it for presentation inside of a data
  # file.  If we're saving in any encoding that has ASCII as a subset and we
  # had bytes returned from the previous step, all we need to do is save to
  # a file.  If we had unicode from the previous step, we need to transform
  # to the encoding we're using and output it.
  u_uri.encode('utf8')
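
The same steps as a runnable sketch, assuming urllib.parse.quote_from_bytes stands in for the percentencoded() placeholder used above (it emits upper-case hex digits) and that the host name is only an example.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes

# Step 1: arbitrary data identifying a resource; not necessarily text.
data = b"\xff\xf0\xef\xe0"

# Step 2: percent-encode it to get the "URI characters".  safe="" means
# even "/" inside the data would be escaped.
path = quote_from_bytes(data, safe="")
uri = "http://host/" + path

# Step 3: encode the URI characters as octets for storage or transport.
# The URI is a subset of ASCII, so any ASCII-compatible encoding works.
wire = uri.encode("ascii")

# Going back: the original data is recovered from the percent-encoding.
recovered = unquote_to_bytes(path)

print(uri)
print(wire)
print(recovered == data)
```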

With all this in mind... URIs are text according to the RFC if you want to
deal with URIs that are percent encoded.  In other words, things like this::
  http://host/%ff%f0%ef%e0

If you want to deal with things like this::
  http://host/café

Then you are going one step further; back to the original data that was
encoded in the RFC.  At that point you are no longer dealing with the
sequence of characters talked about in the RFC.  You are dealing with data
which may or may not be text.

As Robert Collins says, this is bytes by definition which I pretty much
agree with.  It's very very convenient to work with this data as text most
of the time but the RFC does not mandate that it is text so operating on it
as bytes is perfectly reasonable.

> It's not just a matter
> of manipulating the URIs themselves, where working directly on bytes
> will work just as well and with the same string operations (as
> long as everything is bytes).  It's also a question of API complexity
> (eg, Barry's bugaboo of proliferation of encoding= parameters) and of
> debugging (if URIs are internally str, then they will display sanely
> in tracebacks and the interpreter).

The proliferation of encoding I agree is a thing that is ugly.  Although, if
I'm thinking correctly, that only matters when you want to allow mixing
bytes and unicode, correct?  One of these cases:

* I take in some mix of parameters with at least one unicode and output bytes
* I take in some mix of parameters with at least one bytes and output unicode
* I take in either bytes or unicode and transform them internally to the
  other type before operating on them.  Then I transform the output to the
  input type before returning.

For debugging, I'm either not understanding or you're wrong.  If I'm given
an arbitrary sequence of bytes how do I sanely store them as str internally?
If I transform them using an encoding that anticipates the full range of
bytes I may be able to display some representation of them but it's not
necessarily the sanest method of display (for instance, if I know that path
element 1 is always going to be a utf8 encoded string and path element 2 is
always shift-jis encoded, and path element 3 is binary data, I could
construct a much saner display method than treating the whole thing as
latin1).
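
To make the display point concrete, here is a sketch that assumes the application already knows the encoding of each path element.  The function describe_path and the list-of-codecs convention are invented for this example.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def describe_path(path, segment_codecs):
    """Decode each percent-encoded path element with its own codec.

    segment_codecs holds one codec name per element, or None for an
    element that is binary data and should be shown as hex.
    """
    shown = []
    for raw, codec in zip(path.strip("/").split("/"), segment_codecs):
        data = unquote_to_bytes(raw)
        shown.append(data.hex() if codec is None else data.decode(codec))
    return shown


# Build a path whose elements use utf-8, shift-jis and raw binary data.
elements = ["café".encode("utf-8"), "日本".encode("shift_jis"), b"\xff\xf0"]
path = "/" + "/".join(quote_from_bytes(e, safe="") for e in elements)

print(describe_path(path, ["utf-8", "shift_jis", None]))
# Treating everything as latin1 never fails, but the text is garbled.
print(describe_path(path, ["latin-1", "latin-1", "latin-1"]))
```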

> The cases where URIs can't be sanely treated as text are garbage
> input, and the stdlib should not try to provide a solution.  Just
> passing in bytes and getting out bytes is GIGO.  Trying to do "some"
> error-checking is going to be insufficient much of the time and overly
> strict most of the rest of the time.  The programmer in the trenches
> is going to need to decide what to allow and what not; I don't think
> there are general answers because we know that allowing random URLs on
> the web leads to various kinds of problems.  Some sites will need to
> address some of them.
> 
What is your basis for asserting that URIs that aren't sanely treated as
text are garbage?  It's definitely not in the RFC.

> Note also that the "complete solution" argument cuts both ways.  Eg, a
> "complete" solution should implement UTS 39 "confusables detection"[1]
> and IDNA[2].  Good luck doing that with bytes!
> 
Note that IDNA and confusables detection operate on a different portion of
the uri than the need for bytes.  Those operate on the domain name (looks
like it's called the authority in the rfc) whereas bytes are useful for the
path, query, and fragment portions.
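
As an illustration of that split, the sketch below applies the stdlib "idna" codec to the host only and leaves the percent-encoded path alone.  The URI is made up, and the "idna" codec implements the older IDNA 2003 rules, so treat this as a rough picture rather than a complete IDNA treatment.

```python
from urllib.parse import urlsplit

# Hypothetical URI: a non-ASCII host plus a path carrying non-text bytes.
parts = urlsplit("http://bücher.example/%FF%F0?q=%E9")

# IDNA concerns the authority (host) component only.
host_ascii = parts.hostname.encode("idna")

# The path and query stay percent-encoded; nothing here needs them as text.
print(host_ascii)
print(parts.path)
print(parts.query)
```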

Note:  I'm not sure precisely what Philip is looking to do but the little
I've read sounds like it's contrary to the design principles of the python3
unicode handling redesign.  I'm stating my reading of the RFC not to defend
the use case Philip has, but because I think that the outlook that non-text
uris (before being percentencoded) are violations of the RFC is wrong and
will lead to interoperability problems/warts (since you could turn them into
latin1 and from there into bytes and from there into the proper values) if
allowed to dominate the thinking.

-Toshio

From raymond.hettinger at gmail.com  Tue Jun 22 08:21:51 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 21 Jun 2010 23:21:51 -0700
Subject: [Python-Dev] UserDict in 2.7
Message-ID: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>

There's an entry in whatsnew for 2.7 to the effect of "The UserDict class is now a new-style class".

I had thought there was a conscious decision to not change any existing classes from old-style to new-style.  IIRC, Martin had championed this idea and had rejected all proposals to make existing classes inherit from object.


Raymond


From ronaldoussoren at mac.com  Tue Jun 22 08:39:19 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 22 Jun 2010 08:39:19 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <1277151926.3369.6.camel@localhost.localdomain>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk>
	<1277151926.3369.6.camel@localhost.localdomain>
Message-ID: <30F79991-F933-44C6-A884-5A8D5671DB8C@mac.com>


On 21 Jun, 2010, at 22:25, Antoine Pitrou wrote:

> On Monday, 21 June 2010 at 21:13 +0100, Michael Foord wrote:
>> 
>> If OS X is a supported and important platform for Python then fixing all 
>> problems that it reveals (or being willing to) should definitely not be 
>> a pre-requisite of providing a buildbot (which is already a service to 
>> the Python developer community). Fixing bugs / failures revealed by 
>> Bill's buildbot is not fixing them "for Bill" it is fixing them for Python.
> 
> I didn't say it was a prerequisite. I was merely pointing out that when
> platform-specific bugs appear, people using the specific platform should
> be helping if they want to actually encourage the fixing of these bugs.
> 
> OS X is only "a supported and important platform" if we have dedicated
> core developers diagnosing or even fixing issues for it (like we
> obviously have for Windows and Linux). Otherwise, I don't think we have
> any moral obligation to support it.

I look into and fix OSX issues, but do so in my spare time. This means it can take a while until I get around to doing so.

Ronald

P.S. Please file bugs for issues on OSX and set the component to Macintosh instead of discussing them on python-dev. I don't read python-dev on a daily basis and almost missed this thread.

From raymond.hettinger at gmail.com  Tue Jun 22 08:47:46 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Mon, 21 Jun 2010 23:47:46 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>
Message-ID: <89AE7ED6-FB94-45DA-9432-7FCBA25A56BF@gmail.com>


On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
>   This is a common pain-point for porting software to 3.x - you had a string, it kinda worked most of the time before, but now you need to keep track of text too and the functions which seemed to work on bytes no longer do.

Thanks Glyph.  That is a nice summary of one kind of challenge facing programmers.


Raymond


From stephen at xemacs.org  Tue Jun 22 08:49:01 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 15:49:01 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
Message-ID: <87k4pr36le.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:

 > I know, it's a hard thing to wrap one's head around, since on the
 > surface it sounds like unicode is the programmer's savior.

I don't need to wrap my head around it.  It's been deeply embedded,
point first, and the nasty barbs ensure that I have no desire to pull
it back out.

To wit, I've been dealing with Japanese encoding issues on a daily
basis for 20 years, and I'm well aware that programmers have several
good reasons (and a lot more bad ones) for avoiding them, and even for
avoiding Unicode when they must deal with encodings at all.  I don't
think any of the good reasons have been offered here yet, that's all.

 > Unfortunately, real-world text data exists which cannot be safely
 > roundtripped to unicode, and must be handled in "bytes with
 > encoding" form for certain operations.

Or "Unicode with encoding" form.  See below for why this makes sense in
the context of Python.

 > I personally do not have to deal with this *particular* use case any 
 > more -- I haven't been at NTT/Verio for six years now.

As mentioned, I have a bit of understanding of the specific problems
of Japanese-language computing.  In particular, roundtripping Japanese
from *any* encoding to *any other* encoding is problematic, because
the national standards provide a proper subset of the repertoire
actually used by the Japanese people.  (Even JIS X 0213.)

 > My current needs are simpler, thank goodness.  ;-)  However, they 
 > *do* involve situations where I'm dealing with *other* 
 > encoding-restricted legacy systems, such as software for interfacing 
 > with the US Postal Service that only works with a restricted subset 
 > of latin1, while receiving mangled ASCII from an ecommerce provider, 
 > and storing things in what's effectively a latin-1 database.

Yes, I know of similar issues in other applications.  For example, TeX
error messages do not respect UTF-8 character boundaries, so Emacs has
to handle them specially (basically a mechanism similar in spirit to
PEP 383 is used).

 > Being able to easily assert what kind of bytes I've got would
 > actually let me catch errors sooner, *if* those assertions were
 > being checked when different kinds of strings or bytes were being
 > combined (i.e., at coercion time).

I see that this would make life a little easier for you in maintaining
without refactoring.  I'd say it's a kludge, but without a full list
of requirements I'm in no position to claim any authority <wink>.  Eg,
for a non-kludgey suggestion, how about defining a codec which takes
Latin-1 bytes, checks (with error on failure) for the restricted
subset, and converts to str?  Then you can manipulate these things as
str with abandon internally.  Finally you get another check in the
outgoing codec which converts from str to "effective Latin-1 bytes",
however that is defined.
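
A sketch of that suggestion, written as a pair of plain functions rather than a registered codec to keep it short.  The ALLOWED repertoire is a made-up stand-in; the real restricted subset would come from the legacy system's documentation.

```python
# Hypothetical repertoire: printable ASCII plus a few accented letters.
ALLOWED = frozenset(map(chr, range(0x20, 0x7F))) | frozenset("ÁÉÍÑÓÚÜáéíñóúü")


def decode_restricted(data, allowed=ALLOWED):
    """Latin-1 bytes -> str, raising ValueError outside the repertoire."""
    text = data.decode("latin-1")
    bad = sorted(set(text) - allowed)
    if bad:
        raise ValueError("characters outside repertoire: %r" % bad)
    return text


def encode_restricted(text, allowed=ALLOWED):
    """str -> Latin-1 bytes, with the same check on the way out."""
    bad = sorted(set(text) - allowed)
    if bad:
        raise ValueError("characters outside repertoire: %r" % bad)
    return text.encode("latin-1")


name = decode_restricted(b"Jos\xe9")   # checked once, on the way in
label = name.upper()                   # ordinary str operations inside
print(encode_restricted(label))        # checked again, on the way out
```

Both checks sit at the boundary, so everything in between is ordinary str and needs no per-operation assertions.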

But OK, maybe I'm just being naive.  You need this unlovely artifice
so you can put in asserts in appropriate places.  Now, does it belong
in the stdlib?

It seems to me that in the case of Japanese roundtripping, *most* of
the time encoding back to a standard Japanese encoding will work.  If
you run into one of the problematic characters that JIS doesn't allow
but Japanese like to use because they prefer the glyph to the
JIS-standard glyph, you get an occasional error on encoding to a
standard Japanese encoding, which you handle specially with a database
of such characters.  Knowing the specific encoding originally used
*normally does not help unless you're replying to that person and
**only** that person*, because the extended repertoires vary widely
and the only standard is Japanese.  I conclude ebytes does *no* good
here.

For the ecommerce/USPS case, well, actually you need special-purpose
encodings anyway (ISTM).  'latin-1' loses, the USPS is allergic to
some valid 'latin-1' characters.  'ascii' loses, apparently you need
some of the Latin-1 repertoire, and anyway AIUI the ecommerce provider
munges the ASCII.  So what does ebytes actually buy you here, unless
you write the codecs?  If you've got the codecs, what additional
benefit do you get from ebytes?

Note that you would *also* need to do explicit transcoding anyway if
you were dealing with Japan Post instead of the USPS, although I grant
your code is probably general enough to deal with Deutsche Telecom
(but the German equivalent of your ecommerce provider probably has its
own ways of munging Latin-1).  I conclude that there may be genuine
benefits to ebytes here, but they're probably not general enough to
put in the stdlib (or the Python language).

 > Which works if and only if your outputs are truly unicode-able.

With PEP 383, they always are, as long as you allow Unicode to be
decoded to the same garbage your bytes-based program would have
produced anyway.

 > If you work with legacy systems (e.g. those Asian email clients and
 > US postal software), you are really working with a *character set*,
 > not unicode,

I think you're missing something.  Namely, Unicode is a standard for
handling character objects as integers, and a registry for mapping
characters to integers.  It includes over 100,000 points for making up
your own mappings, and recent Python also provides (as an internal
extension) for embedding non-characters in a str.

Unicode does not define a repertoire, however.  That's up to the
application, and Python 2+ provides a convenient way to restrict
repertoires by defining special purpose codecs in Python.

It is then up to the program to ensure that all candidates claiming to
be text pass through the cleansing fire of a codec before being
allowed into the Pure Land of str.  This can be something of a
problem; there are a few ways for textual data to get into Python, and
not all of them were obvious to me.  But this problem would be even
worse for mechanisms like ebytes, where it's up to the programmer to
decide which things are put into ebytes.

 > and so putting your data in unicode form is actually *wrong*
 > -- an expedient lie.
 > 
 > Heresy, I know, but there you go.  ;-)

It's not heresy, it's simply assuming a restriction on use of Unicode
that just isn't true.  It *is* true that mapping the data to Unicode
according to some encoding is not always sufficient.  It *is* often
the case that further information must be provided to ensure semantic
correctness.  However, given the mapping (== properly defined codecs),
roundtripping *is* always possible, at least up to the size of private
space, which is big enough to hold the Post Office's repertoire, for
sure.  And that mapping is a Python object which will fit into a
variable for later use.


From stephen at xemacs.org  Tue Jun 22 09:33:53 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 16:33:53 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>
Message-ID: <87hbkv34im.fsf@uwakimon.sk.tsukuba.ac.jp>

Glyph Lefkowitz writes:
 > On Jun 21, 2010, at 10:58 PM, Stephen J. Turnbull wrote:

 > > Note also that the "complete solution" argument cuts both ways.  Eg, a
 > > "complete" solution should implement UTS 39 "confusables detection"[1]
 > > and IDNA[2].  Good luck doing that with bytes!
 > 
 > And good luck doing that with just characters, too.

I agree with you, sorry.  I meant to cast doubt on the idea of
complete solutions, or at least claims that completeness is an excuse
for putting it in the stdlib.

 > This is the limitation that everyone seems to keep dancing around.
 > If you are using the stdlib, with functions that operate on
 > sequences like 'str' or 'bytes', you need to choose from one of
 > three options: 

There's a *fourth* way: specially designed codecs to preserve as much
metainformation as you need, while always using the str format
internally.  This can be done for at least 100,000 separate
(character, encoding) pairs by multiplexing into private space with an
auxiliary table of encodings and equivalences.  That's probably
overkill.  In many cases, adding a simple PEP 383 mechanism (to preserve
uninterpreted bytes) might be enough though, and that's pretty
plausible IMO.


From lesni.bleble at gmail.com  Tue Jun 22 11:08:56 2010
From: lesni.bleble at gmail.com (lesni bleble)
Date: Tue, 22 Jun 2010 11:08:56 +0200
Subject: [Python-Dev] adding new function
Message-ID: <AANLkTik31g_9oKU9DjSvihFEpu6PlAOf9Vm44FxesDOw@mail.gmail.com>

hello,

how can i simply add new functions to module after its initialization
(Py_InitModule())?  I'm missing something like
PyModule_AddCFunction().

thank you

L.

From fetchinson at googlemail.com  Tue Jun 22 11:44:38 2010
From: fetchinson at googlemail.com (Daniel Fetchinson)
Date: Tue, 22 Jun 2010 11:44:38 +0200
Subject: [Python-Dev] adding new function
In-Reply-To: <AANLkTik31g_9oKU9DjSvihFEpu6PlAOf9Vm44FxesDOw@mail.gmail.com>
References: <AANLkTik31g_9oKU9DjSvihFEpu6PlAOf9Vm44FxesDOw@mail.gmail.com>
Message-ID: <AANLkTily_udwJTJScVezHzNtY6rhrDdiaU0Ass0_Hfsv@mail.gmail.com>

> how can i simply add new functions to module after its initialization
> (Py_InitModule())?  I'm missing something like
> PyModule_AddCFunction().

This type of question really belongs to python-list aka
comp.lang.python which I CC-d now. Please keep the discussion on that
list.

Cheers,
Daniel


-- 
Psss, psss, put it down! - http://www.cafepress.com/putitdown

From ncoghlan at gmail.com  Tue Jun 22 12:41:39 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Tue, 22 Jun 2010 20:41:39 +1000
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87k4pr36le.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
	<87k4pr36le.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTim2VPW0IuKqSk_vOOhP4pRywsduX7Bo0SRu1qYl@mail.gmail.com>

On Tue, Jun 22, 2010 at 4:49 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
>  > Which works if and only if your outputs are truly unicode-able.
>
> With PEP 383, they always are, as long as you allow Unicode to be
> decoded to the same garbage your bytes-based program would have
> produced anyway.

Could it be that part of the problem here is that we need to better
advertise "errors='surrogateescape'" as a mechanism for decoding
incorrectly encoded data according to a nominal codec without throwing
UnicodeDecodeError and UnicodeEncodeError all over the place? Currently
it only garners a mention in the docs in the context of the os module,
the list of error handlers in the codecs module and as a default error
handler argument in the tarfile module.
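
For readers who have not met the error handler, a minimal round trip looks like this.  The input bytes are an invented example of data that is nominally UTF-8 but incorrectly encoded.

```python
# Nominally UTF-8, but 0xE9 and 0xFF are not valid UTF-8 here.
raw = b"caf\xe9 \xff"

# Strict decoding would raise UnicodeDecodeError.  With surrogateescape,
# each undecodable byte becomes a lone surrogate in U+DC80..U+DCFF.
text = raw.decode("utf-8", errors="surrogateescape")
print(ascii(text))

# Ordinary str operations work on the result.
print(text.startswith("caf"))

# Encoding with the same handler restores the original bytes exactly.
print(text.encode("utf-8", errors="surrogateescape") == raw)
```

Encoding such a string without the handler still fails with UnicodeEncodeError, so the escaped bytes cannot slip silently into strictly encoded output.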

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Tue Jun 22 12:52:39 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Tue, 22 Jun 2010 20:52:39 +1000
Subject: [Python-Dev] [OT] glyphs [was Re:  email package status in 3.X]
In-Reply-To: <hvp4ll$230$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
	<hvp4ll$230$1@dough.gmane.org>
Message-ID: <201006222052.39734.steve@pearwood.info>

On Tue, 22 Jun 2010 11:46:27 am Terry Reedy wrote:
> 3. Unicode disclaims direct representation of glyphic variants
> (though again, exceptions were made for asian acceptance). For
> example, in English, mechanically printed 'a' and 'g' are different
> from manually printed 'a' and 'g'. Representing both by the same
> codepoint, in itself, loses information. One who wishes to preserve
> the distinction must instead use a font tag or perhaps a
> <handprinted> tag. Similarly, older English had a significantly
> different glyph for 's', which looks more like a modern 'f'.

An unfortunate example, as the old English long-s gets its own Unicode 
codepoint.

http://en.wikipedia.org/wiki/Long_s


-- 
Steven D'Aprano

From stephen at xemacs.org  Tue Jun 22 13:31:13 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 20:31:13 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100622055040.GE5787@unaka.lan>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
Message-ID: <87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>

Toshio Kuratomi writes:

 > I'll definitely buy that.  Would urljoin(b_base, b_subdir) => bytes and
 > urljoin(u_base, u_subdir) => unicode be acceptable though?

Probably.  

But it doesn't matter what I say, since Guido has defined that as
"polymorphism" and approved it in principle.

 > (I think, given other options, I'd rather see two separate
 > functions, though.

Yes.

 > If you want to deal with things like this::
 >   http://host/café

Yes.

 > At that point you are no longer dealing with the sequence of
 > characters talked about in the RFC.  You are dealing with data
 > which may or may not be text.

That's right, and I think that in most cases that is what programmers
want to be dealing with.  Let the library make sure that what goes on
the wire conforms to the RFC.  I don't want to know about it, I want
to work with the content of the URI.

 > The proliferation of encoding I agree is a thing that is ugly.
 > Although, if I'm thinking correctly, that only matters when you
 > want to allow mixing bytes and unicode, correct?

Well you need to know a fair amount about the encoding: that the
reserved bytes are used as defined in the RFC, for example.

 > For debugging, I'm either not understanding or you're wrong.  If I'm given
 > an arbitrary sequence of bytes how do I sanely store them as str internally?

If it's really arbitrary, you use either a mapping to private space or
PEP 383, and accept that it won't make sense.  But in most cases you
should be able to achieve a fair degree of sanity.

 > If I transform them using an encoding that anticipates the full range of
 > bytes I may be able to display some representation of them but it's not
 > necessarily the sanest method of display (for instance, if I know that path
 > element 1 is always going to be a utf8 encoded string and path element 2 is
 > always shift-jis encoded, and path element 3 is binary data, I could
 > construct a much saner display method than treating the whole thing as
 > latin1).

And I think in most cases you will know, although the cases where
you'll know will be because of a system-wide encoding.

 > What is your basis for asserting that URIs that aren't sanely treated as
 > text are garbage?

I don't mean we can throw them away, I mean we can't do any sensible
processing on them.  You at least need to know about the reserved
delimiters.  In the same way that Philip used 'garbage' for the
"unknown" encoding.  And in the sense of "garbage in, garbage out".

 > unicode handling redesign.  I'm stating my reading of the RFC not to defend
 > the use case Philip has, but because I think that the outlook that non-text
 > uris (before being percentencoded) are violations of the RFC

That's not what I'm saying.  What I'm trying to point out is that
manipulating a bytes object as an URI sort of presumes a lot about its
encoding as text.  Since many of the URIs we deal with are more or
less textual, why not take advantage of that?

From stephen at xemacs.org  Tue Jun 22 13:55:41 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Tue, 22 Jun 2010 20:55:41 +0900
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTim2VPW0IuKqSk_vOOhP4pRywsduX7Bo0SRu1qYl@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<8739wg469t.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>
	<87k4pr36le.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTim2VPW0IuKqSk_vOOhP4pRywsduX7Bo0SRu1qYl@mail.gmail.com>
Message-ID: <87aaqn2sea.fsf@uwakimon.sk.tsukuba.ac.jp>

Nick Coghlan writes:
 > On Tue, Jun 22, 2010 at 4:49 PM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
 > >  > Which works if and only if your outputs are truly unicode-able.
 > >
 > > With PEP 383, they always are, as long as you allow Unicode to be
 > > decoded to the same garbage your bytes-based program would have
 > > produced anyway.
 > 
 > Could it be that part of the problem here is that we need to better
 > advertise "errors='surrogateescape'" as a mechanism for decoding
 > incorrectly encoded data according to a nominal codec without throwing
 > UnicodeDecode and UnicodeEncode errors all over the place?

Yes, I think that would make the "use str internally to urllib"
strategy a lot more palatable.  But it still needs to be combined with
a program architecture of decode-process-encode, which might require
substantial refactoring for some existing modules.
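
The decode-process-encode pattern with PEP 383 can be sketched as follows.
This is only an illustration under stated assumptions: the sample bytes and
the path rewrite are made up, and UTF-8 is assumed as the nominal codec.

```python
# Sketch: round-trip incorrectly encoded bytes through str (PEP 383).
# The sample data is hypothetical: latin-1 bytes that are not valid UTF-8.
raw = b'/caf\xe9/menu'

# Decode: the undecodable byte 0xE9 becomes the lone surrogate U+DCE9
# instead of raising UnicodeDecodeError.
text = raw.decode('utf-8', errors='surrogateescape')

# Process: ordinary str operations work on the decoded value.
processed = text.replace('/menu', '/carte')

# Encode: the surrogate is turned back into the original byte.
out = processed.encode('utf-8', errors='surrogateescape')
```

The escaped byte survives the round trip unchanged, so the output is the
same "garbage" a bytes-based program would have produced.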


From fdrake at acm.org  Tue Jun 22 14:40:29 2010
From: fdrake at acm.org (Fred Drake)
Date: Tue, 22 Jun 2010 08:40:29 -0400
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
Message-ID: <AANLkTikVtfCipb5zeeeuCicBz34-ltkTUTeyKQjx3ump@mail.gmail.com>

On Tue, Jun 22, 2010 at 2:21 AM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> I had thought there was a conscious decision to not change any existing
> classes from old-style to new-style.

I thought so as well.  Changing any class from old-style to new-style
risks breaking applications in obscure & mysterious ways.  (Yes, we've
been bitten by this before; it's a real problem.)


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From benjamin at python.org  Tue Jun 22 14:48:25 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 22 Jun 2010 07:48:25 -0500
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
Message-ID: <AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>

2010/6/22 Raymond Hettinger <raymond.hettinger at gmail.com>:
> There's an entry in whatsnew for 2.7 to the effect of "The UserDict class is
> now a new-style class".
> I had thought there was a conscious decision to not change any existing
> classes from old-style to new-style.  IIRC, Martin had championed this idea
> and had rejected all of the proposals to make existing classes inherit from
> object.

IIRC this was because UserDict tries to be a MutableMapping but abcs
require new style classes.


-- 
Regards,
Benjamin

From lvh at laurensvh.be  Tue Jun 22 15:23:36 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Tue, 22 Jun 2010 15:23:36 +0200
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <AANLkTikVtfCipb5zeeeuCicBz34-ltkTUTeyKQjx3ump@mail.gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
	<AANLkTikVtfCipb5zeeeuCicBz34-ltkTUTeyKQjx3ump@mail.gmail.com>
Message-ID: <AANLkTinGpYKcEjDMnPNraQSTXvP1Z4SUulmeMze0_DSO@mail.gmail.com>

On Tue, Jun 22, 2010 at 2:40 PM, Fred Drake <fdrake at acm.org> wrote:
> On Tue, Jun 22, 2010 at 2:21 AM, Raymond Hettinger
> <raymond.hettinger at gmail.com> wrote:
>> I had thought there was a conscious decision to not change any existing
>> classes from old-style to new-style.
>
> I thought so as well.  Changing any class from old-style to new-style
> risks breaking applications in obscure & mysterious ways.  (Yes, we've
> been bitten by this before; it's a real problem.)
>
>
>   -Fred

+1. I've been bitten by this more than once in some of the more
obscure old(-style) classes in twisted.python.

Laurens

From murman at gmail.com  Tue Jun 22 15:24:28 2010
From: murman at gmail.com (Michael Urman)
Date: Tue, 22 Jun 2010 08:24:28 -0500
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>
	<87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTinv0RpzrqtQiL3hN170-zICQQMcDUJJ_2Kb2NKH@mail.gmail.com>

On Tue, Jun 22, 2010 at 00:28, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Michael Urman writes:
>
>  > It is somewhat troublesome that there doesn't appear to be an obvious
>  > built-in idempotent-when-possible function that gives back the
>  > provided bytes/str,
>
> If you want something idempotent, it's already the case that
> bytes(b'abc') => b'abc'.  What might be desirable is to make
> bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
> (or maybe ISO 8859/1).

By idempotent-when-possible, I mean to_bytes(str_or_bytes, encoding,
errors) that would pass an instance of bytes through, or encode an
instance of str. And of course a to_str that performs similarly,
passing str through and decoding bytes. While bytes(b'abc') will give
me b'abc', neither bytes('abc') nor bytes(b'abc', 'latin-1') get me
the b'abc' I want to see.

These are trivial functions; I just don't fully understand why the
capability isn't baked in. A one argument call is idempotent capable;
a two argument call isn't as it only converts.
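
A minimal sketch of such helpers might look like the following. The names
to_bytes and to_str come from the description above; the default encoding
and error handler are assumptions chosen for illustration.

```python
def to_bytes(value, encoding='utf-8', errors='strict'):
    """Pass bytes through unchanged; encode str with the given codec."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding, errors)


def to_str(value, encoding='utf-8', errors='strict'):
    """Pass str through unchanged; decode bytes with the given codec."""
    if isinstance(value, str):
        return value
    return value.decode(encoding, errors)
```

With these, to_bytes(b'abc', 'latin-1') and to_bytes('abc', 'latin-1') both
give b'abc', so the same invocation works for either input type.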

It's not a completely made-up requirement either. A cross-platform
piece of software may need to present to a user items that are
sometimes str and sometimes bytes - particularly filenames.

> Unfortunately, str(b'abc') already does work, but
>
> steve at uwakimon ~ $ python3.1
> Python 3.1.2 (release31-maint, May 12 2010, 20:15:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
>>>> str(b'abc')
> "b'abc'"
>>>>
>
> Oops.  You can see why that probably "should" be the case

Sure, and I love having this there for debugging. But this is hardly
good enough for presenting to a user once you leave ascii.
>>> u = '日本語'
>>> sjis = bytes(u, 'shift-jis')
>>> utf8 = bytes(u, 'utf-8')
>>> str(sjis), str(utf8)
("b'\\x93\\xfa\\x96{\\x8c\\xea'",
"b'\\xe6\\x97\\xa5\\xe6\\x9c\\xac\\xe8\\xaa\\x9e'")

When I happen to know the encoding, I can reverse it much more cleanly.
>>> str(sjis, 'shift-jis'), str(utf8, 'utf-8')
('日本語', '日本語')

But I can't mix this approach with str instances without writing a
different invocation.
>>> str(u, 'argh')
TypeError: decoding str is not supported

-- 
Michael Urman

From guido at python.org  Tue Jun 22 18:17:31 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 22 Jun 2010 09:17:31 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100622055040.GE5787@unaka.lan>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
Message-ID: <AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>

[Just addressing one little issue here; generally I'm just happy that
we're discussing this issue in such detail from so many points of
view.]

On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi <a.badger at gmail.com> wrote:
>[...] Would urljoin(b_base, b_subdir) => bytes and
> urljoin(u_base, u_subdir) => unicode be acceptable though?  (I think, given
> other options, I'd rather see two separate functions, though.  It seems more
> discoverable and less prone to taking bad input some of the time to have two
> functions that clearly only take one type of data apiece.)

Hm. I'd rather see a single function (it would be "polymorphic" in my
earlier terminology). After all a large number of string method calls
(and some other utility function calls) already look the same
regardless of whether they are handling bytes or text (as long as it's
uniform). If the building blocks are all polymorphic it's easier to
create additional polymorphic functions.

FWIW, there are two problems with polymorphic functions, though they
can be overcome:

(1) Literals.

If you write something like x.split('&') you are implicitly assuming x
is text. I don't see a very clean way to overcome this; you'll have to
implement some kind of type check e.g.

    x.split('&') if isinstance(x, str) else x.split(b'&')

A handy helper function can be written:

  def literal_as(constant, variable):
      if isinstance(variable, str):
          return constant
      else:
          return constant.encode('utf-8')

So now you can write x.split(literal_as('&', x)).
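
Put together, a small polymorphic function built on that helper could look
like this sketch. The helper is repeated so the block is self-contained;
split_params is a hypothetical name used only for illustration.

```python
def literal_as(constant, variable):
    """Return the str literal in the same type (str or bytes) as variable."""
    if isinstance(variable, str):
        return constant
    return constant.encode('utf-8')


def split_params(query):
    """Split a query string on '&', whether it is text or bytes."""
    return query.split(literal_as('&', query))
```

Here split_params('a=1&b=2') returns a list of str and
split_params(b'a=1&b=2') returns a list of bytes, with no type check at the
call site.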

(2) Data sources.

These can be functions that produce new data from non-string data,
e.g. str(<int>), read it from a named file, etc. An example is read()
vs. write(): it's easy to create a (hypothetical) polymorphic stream
object that accepts both f.write('booh') and f.write(b'booh'); but you
need some other hack to make read() return something that matches a
desired return type. I don't have a generic suggestion for a solution;
for streams in particular, the existing distinction between binary and
text streams works, of course, but there are other situations where
this doesn't generalize (I think some XML interfaces have this
awkwardness in their API for converting a tree to a string).

-- 
--Guido van Rossum (python.org/~guido)

From tseaver at palladion.com  Tue Jun 22 18:37:14 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Tue, 22 Jun 2010 12:37:14 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>	<hvijae$9tc$1@dough.gmane.org>
	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
Message-ID: <hvqorq$i69$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Jesse Noller wrote:
> 
> On Jun 19, 2010, at 10:13 AM, Tres Seaver <tseaver at palladion.com> wrote:

>>> Nothing is set in stone; if something is incredibly painful, or worse
>>> yet broken, then someone needs to file a bug, bring it to this list,
>>> or bring up a patch.
>> Or walk away.
>>
> 
> Ok. If you want.

I specifically said I *didn't* want to walk away.  I'm pointing out that
in the general case, the ordinary user who finds something incredibly
painful or broken is far more likely to walk away from the platform than
try to fix it, especially if there are available alternatives (e.g.,
Ruby, Python 2) where the pain level for that user's application is lower.

>>> I guess tutorial welcome, rather than patch welcome then ;)
>> The only folks who can write the tutorial are the ones who have  
>> already drunk the koolaid.  Note that I've been making my living with Python  
>> for about twelve years now, and would *like* to use Python3, but can't,  
>> yet, and therefore haven't taken the first sip.
> 
> Why can't you? Is it a bug?

It's not *a* bug, it is that I do my day to day work on very large
applications which depend on a large number of not-yet-ported libraries.
 This barrier is the negative "network effect" which is the whole point
of this thread:  there is nothing wrong with Python3 except that, to use
it, I have to stop doing the work which pays to do an
indeterminately-large amount of "hobby" work (of which I already do
quite a lot).

> Let's file it and fix it. Is it that you  
> need a dependency ported?

I need dozens of them ported, and am working on some of them in the
aforementioned "copious spare time."

> Cool - let's bring it up to the maintainers,  
> or this list, or ask the PSF to push resources into helping port.  
> Anything but nothing.

Nothing is the default:  I am already successful with Python 2, and
can't be successful with Python 3 (in the sense of delivering timely,
cost-effective solutions to my customers) until *all* those dependencies
are ported and stable there.

> If what you're saying is that python 3 is a completely unsuitable  
> platform, well, then yeah - we can all "fix" it or walk away.

I didn't say that:  I said that Python 3 is unsuitable *today* for the
work I'm doing, and that the relative wins it provides over Python 2 are
dwarfed by the effort required to do all those ports myself.

>>>> IOW, 3.x has broken TOOOWTDI for me in some areas.  There may
>>>> be obvious ways to do it, but, as per the Zen of Python, "that
>>>> way may not be obvious at first unless you're Dutch".  ;-)

OT:  The Dutch smiley there doesn't actually help anything but undercut
any point to having TOOOWTDI in the list at all.

>>> What areas. We need specifics which can either be:
>>>
>>> 1> Shot down.
>>> 2> Turned into bugs, so they can be fixed
>>> 3> Documented in the core documentation.

>> That's bloody ironic in a thread which had pointed at reasons why  
>> people are not even considering Py3 for their projects:  those folks won't  
>> even find the issues due to the lack of confidence in the suitability of  
>> the platform.
> 
> What I saw was a thread about some issues in email, and cgi. We have  
> some work being done to address the issue. This will help resolve some  
> of the issues.
> 
> If there are other issues, then we should step up and either help, or  
> get out of the way. Arguing about the viability of a platform we knew  
> would take a bit for adoption is silly and breeds ill will.

I'm not arguing about viability:  there are obviously users for whom
Python 3 is not only viable, but superior to Python 2.  However, I am
quite confident that many pro-Python 3 folks arguing here underestimate
the scope of the issues which have generated the (self-fulfilling) "not
yet" perception.

> It's not a turd, and it's not hopeless, in fact rumor has it NumPy  
> will be ported soon which is a major stepping stone.

Sure, for the (far from trivial) subset of the community doing numerical
work.

> The only way to counteract this meme that python 3 is horribly  
> broken is to prove that it's not, fix bugs, and move on. There's no  
> point debating relative turdiness here.

Any "turdiness" (which I am *not* arguing for) is a natural consequence
of the kinds of backward incompatibilities which were *not* ruled out
for Python 3, along with the (early, now waning) "build it and they will
 come" optimism about adoption rates.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwg5rIACgkQ+gerLs4ltQ6J7wCdFkQL7XeKtBM407Z5D2rSKk8n
EWYAoJUfW+JgURUz7NJcWmqFw3PkNYde
=WZEv
-----END PGP SIGNATURE-----


From ronaldoussoren at mac.com  Tue Jun 22 18:39:03 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Tue, 22 Jun 2010 18:39:03 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk> <4C1FE4AF.80009@v.loewis.de>
	<AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>
Message-ID: <EAE2C517-C5EA-4EF7-A4A0-286C3B08381D@mac.com>


On 22 Jun, 2010, at 3:38, Alexander Belopolsky wrote:

> On Mon, Jun 21, 2010 at 6:16 PM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>>> The test_posix failure is a regression from 2.6 (but it only shows up on
>>> some machines - it is caused by a fairly braindead implementation of a
>>> couple of posix apis by Apple apparently).
>>> 
>>> http://bugs.python.org/issue7900
>> 
>> Ah, that one. I definitely think this should *not* block the release:
> 
> I agree that this is nowhere near being a release blocker, but I think
> it would be nice to do something about it before the final release.
> 
>> a) there is no clear solution in sight. So if we wait for it resolved,
>>   it could take months until we get a 2.7 release.
> 
> The ideal solution will have to wait until Apple gets its act together
> and fixes the problem on their end.  I would say "months" is an overly
> optimistic time estimate for that.  

I'd say there is no chance at all that this will be fixed in OSX 10.6, with some luck they'll change this in 10.7.

> However, the issue is a regression
> from prior versions.  In 2.5 getgroups would truncate the list to 16
> groups, but won't crash.  More importantly the 16 groups returned
> would be correct per-process groups and not something immune to
> setgroup changes.
> 
> I proposed a very simple fix:
> 
> http://bugs.python.org/file16326/no-darwin-ext.diff
> 
> which simply minimally reverts the change that introduced the regression.

That is one way to fix it, another just as valid fix is to change posix.getgroups to be able to return more than 16 groups on OSX (see my patch in issue7900). 

Both are valid fixes, and each has advantages and disadvantages.

Your proposal:
* Reverts to the behavior in 2.6
* Ensures that posix.getgroups and posix.setgroups are internally consistent

My proposal:
* Uses the newer ABI, which is more likely to be the one Apple wants you to use
* Is compatible with system tools (that is, posix.getgroups() agrees with id(1))
* Is compatible with /usr/bin/python
* results in posix.getgroups not reflecting results of posix.setgroups

What I haven't done yet, and probably should, is to check how either implementation of getgroups interacts with groups in the System Preferences panel and with groups in managed environment (using OSX Server).

My gut feeling is that the second option (my proposal) would give more useful semantics, but that said: I almost never write code where I need os.setgroups.

Ronald

From dirkjan at ochtman.nl  Tue Jun 22 18:54:21 2010
From: dirkjan at ochtman.nl (Dirkjan Ochtman)
Date: Tue, 22 Jun 2010 18:54:21 +0200
Subject: [Python-Dev] State of json in 2.7
Message-ID: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com>

It looks like simplejson 2.1.0 and 2.1.1 have been released:

http://bob.pythonmac.org/archives/2010/03/10/simplejson-210/
http://bob.pythonmac.org/archives/2010/03/31/simplejson-211/

It looks like any changes that didn't come from the Python tree didn't
go into the Python tree, either.

I guess we can't put these changes into 2.7 anymore? How can we make
this better next time?

Cheers,

Dirkjan

From benjamin at python.org  Tue Jun 22 18:56:09 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 22 Jun 2010 11:56:09 -0500
Subject: [Python-Dev] State of json in 2.7
In-Reply-To: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com>
References: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com>
Message-ID: <AANLkTilTBED-RkN4bTZc6pN6QqDcW6ShREnfP3Jh6zz9@mail.gmail.com>

2010/6/22 Dirkjan Ochtman <dirkjan at ochtman.nl>:
> I guess we can't put these changes into 2.7 anymore? How can we make
> this better next time?

Never have externally maintained packages.


-- 
Regards,
Benjamin

From raymond.hettinger at gmail.com  Tue Jun 22 18:24:42 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 22 Jun 2010 09:24:42 -0700
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
Message-ID: <4FBE15CB-2397-46B2-8417-588F8785BA20@gmail.com>


On Jun 22, 2010, at 5:48 AM, Benjamin Peterson wrote:

> 2010/6/22 Raymond Hettinger <raymond.hettinger at gmail.com>:
>> There's an entry in whatsnew for 2.7 to the effect of "The UserDict class is
>> now a new-style class".
>> I had thought there was a conscious decision to not change any existing
>> classes from old-style to new-style.  IIRC, Martin had championed this idea
>> and had rejected all of the proposals to make existing classes inherit from
>> object.
> 
> IIRC this was because UserDict tries to be a MutableMapping but abcs
> require new style classes.

ISTM, this change should be reverted to the way it was in 2.6.

The registration was already working fine:

Python 2.6.4 (r264:75821M, Oct 27 2009, 19:48:32) 
[GCC 4.0.1 (Apple Inc. build 5493)] on darwin
>>> import UserDict
>>> import collections
>>> collections.MutableMapping.register(UserDict.UserDict)
>>> issubclass(UserDict.UserDict, collections.MutableMapping)
True

We didn't have any problems with this registration
nor did there seem to be an issue with UserDict not 
implementing dictviews.
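
For readers on Python 3, where the ABCs live in collections.abc, the same
registration mechanism can be sketched as below. LegacyDict is a
hypothetical stand-in for a class that does not inherit from the ABC; it is
not the real UserDict.

```python
from collections.abc import MutableMapping


class LegacyDict:
    """Hypothetical mapping-like class that does not subclass the ABC."""

    def __init__(self):
        self.data = {}


# Register LegacyDict as a "virtual subclass". Registration does not check
# that the mapping methods are actually implemented.
MutableMapping.register(LegacyDict)

is_subclass = issubclass(LegacyDict, MutableMapping)
is_instance = isinstance(LegacyDict(), MutableMapping)
```

Both checks succeed after registration even though LegacyDict has no
MutableMapping in its base classes.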

Please revert this change.  UserDicts have a long history
and are used by a lot of code, so we need to avoid
unnecessary breakage.


Thank you,


Raymond


From ianb at colorstudy.com  Tue Jun 22 19:03:29 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 22 Jun 2010 12:03:29 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>

On Tue, Jun 22, 2010 at 6:31 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

> Toshio Kuratomi writes:
>
>  > I'll definitely buy that.  Would urljoin(b_base, b_subdir) => bytes and
>  > urljoin(u_base, u_subdir) => unicode be acceptable though?
>
> Probably.
>
> But it doesn't matter what I say, since Guido has defined that as
> "polymorphism" and approved it in principle.
>
>  > (I think, given other options, I'd rather see two separate
>  > functions, though.
>
> Yes.
>
>  > If you want to deal with things like this::
>  >   http://host/café <http://host/caf%C3%A9>
>
> Yes.
>

Just for perspective, I don't know if I've ever wanted to deal with a URL
like that.  I know how it is supposed to work, and I know what a browser
does with that, but so many tools will clean that URL up *or* won't be able
to deal with it at all that it's not something I'll be passing around.  So
from a practical point of view this really doesn't come up, and if it did it
would be in a situation where you could easily do something ad hoc (though
there is not currently a routine to quote unsafe characters in a URL... that
would be helpful, though maybe urllib.quote(url.encode('utf8'), '%/:') would
do it).
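
A sketch of that ad hoc quoting, written for Python 3, where the function is
urllib.parse.quote rather than urllib.quote. The sample URL is the one from
the quoted message; treating '%', '/' and ':' as safe follows the suggestion
above and is not a general-purpose URL normalizer.

```python
from urllib.parse import quote

url = 'http://host/caf\u00e9'

# Encode to UTF-8 first, then percent-quote everything that is not safe.
# '%' is kept safe so already-quoted sequences are not quoted twice.
quoted = quote(url.encode('utf8'), safe='%/:')

# Quoting the result again leaves it unchanged.
requoted = quote(quoted.encode('utf8'), safe='%/:')
```

The result is the normalized form http://host/caf%C3%A9, which is plain
ASCII and easy to pass around.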

Also while it is problematic to treat the URL-unquoted value as text
(because it has an unknown encoding, no encoding, or regularly a mixture of
encodings), the URL-quoted value is pretty easy to pass around, and
normalization (in this case to http://host/caf%C3%A9) is generally fine.

While it's nice to be correct about encodings, sometimes it is impractical.
And it is far nicer to avoid the situation entirely.  That is, decoding
content you don't care about isn't just inefficient, it's complicated and
can introduce errors.  The encoding of the underlying bytes of a %-decoded
URL is largely uninteresting.  Browsers (whose behavior drives a lot of
convention) don't touch any of that encoding except lately occasionally to
*display* some data in a more friendly way.  But it's only display, and
errors just make it revert to the old encoded display.

Similarly I'd expect (from experience) that a programmer using Python to
want to take the same approach, sticking with unencoded data in nearly all
situations.


-- 
Ian Bicking  |  http://blog.ianbicking.org

From alexander.belopolsky at gmail.com  Tue Jun 22 19:05:38 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Tue, 22 Jun 2010 13:05:38 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <EAE2C517-C5EA-4EF7-A4A0-286C3B08381D@mac.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk> <4C1FE4AF.80009@v.loewis.de>
	<AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>
	<EAE2C517-C5EA-4EF7-A4A0-286C3B08381D@mac.com>
Message-ID: <AANLkTimrLwsklrQzBLbjf0LOCycp_gRa97gqc721NsNs@mail.gmail.com>

On Tue, Jun 22, 2010 at 12:39 PM, Ronald Oussoren
<ronaldoussoren at mac.com> wrote:
..
> Both are valid fixes, and each has advantages and disadvantages.
>
> Your proposal:
> * Reverts to the behavior in 2.6
> * Ensures that posix.getgroups and posix.setgroups are internally consistent
>
It is also very simple and since posix module worked fine on OSX for
years without _DARWIN_C_SOURCE, I think this is a very low risk
change.

> My proposal:
> * Uses the newer ABI, which is more likely to be the one Apple wants you to use

I don't think so.  In getgroups(2) I see

LEGACY DESCRIPTION
     If _DARWIN_C_SOURCE is defined, getgroups() can return more than
{NGROUPS_MAX} groups.

This suggests that this is legacy behavior.  Newer applications should
use getgrouplist instead.

> * Is compatible with system tools (that is, posix.getgroups() agrees with id(1))

I have not tested this recently, but I think if you exec id from a
program after a call to setgroups(), it will return process groups,
not user groups.

> * Is compatible with /usr/bin/python

I am sure that once this issue is fixed upstream, Apple will pick it up
with the next version.

> * results in posix.getgroups not reflecting results of posix.setgroups
>

This effectively substitutes getgrouplist called on the current user
for getgroups.  In 3.x, I believe the correct action will be to
provide direct access to getgrouplist, which, while not POSIX (yet?),
is widely available.

From benjamin at python.org  Tue Jun 22 19:08:02 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Tue, 22 Jun 2010 12:08:02 -0500
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <4FBE15CB-2397-46B2-8417-588F8785BA20@gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
	<4FBE15CB-2397-46B2-8417-588F8785BA20@gmail.com>
Message-ID: <AANLkTil4XWNZSzsa8j5r2eCbp9mugzUj2zR1WeVH9FBj@mail.gmail.com>

2010/6/22 Raymond Hettinger <raymond.hettinger at gmail.com>:
>
> On Jun 22, 2010, at 5:48 AM, Benjamin Peterson wrote:
>
>> 2010/6/22 Raymond Hettinger <raymond.hettinger at gmail.com>:
>>> There's an entry in whatsnew for 2.7 to the effect of "The UserDict class is
>>> now a new-style class".
>>> I had thought there was a conscious decision to not change any existing
>>> classes from old-style to new-style.  IIRC, Martin had championed this idea
>>> and had rejected all of the proposals to make existing classes inherit from
>>> object.
>>
>> IIRC this was because UserDict tries to be a MutableMapping but abcs
>> require new style classes.
>
> ISTM, this change should be reverted to the way it was in 2.6.
>
> The registration was already working fine:

Actually I believe it was an error that it could. There was a typo in
abc.py which prevented it from raising errors when non new-style class
objects were passed in.


-- 
Regards,
Benjamin

From janssen at parc.com  Tue Jun 22 19:17:01 2010
From: janssen at parc.com (Bill Janssen)
Date: Tue, 22 Jun 2010 10:17:01 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTilB0PJCqEhMfqAHnQrreYaO7c5gn0ZkL-9RzK5j@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk>
	<4C1FE030.7020700@voidspace.org.uk>
	<AANLkTinWl76iKLdja0dGC8Fjzh9RiZ7M9DJ91JaXjxZy@mail.gmail.com>
	<AANLkTinUu3DSkzgB5b2YKFznUJeHKUmdMXS44aHAUkim@mail.gmail.com>
	<AANLkTimwTtqA9wgaoZffU8QzHvbmyOAyu_aD3kxG6_jo@mail.gmail.com>
	<AANLkTin1rCySAEMxq2hAEeF8xF2t3Ibj89SG1KW-2C_T@mail.gmail.com>
	<1180.1277170019@parc.com>
	<AANLkTilB0PJCqEhMfqAHnQrreYaO7c5gn0ZkL-9RzK5j@mail.gmail.com>
Message-ID: <1422.1277227021@parc.com>

Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> On Mon, Jun 21, 2010 at 9:26 PM, Bill Janssen <janssen at parc.com> wrote:
> ..
> > Though, isn't that behavior of urllib.proxy_bypass another bug?
> 
> I don't know.  Ask Ronald.

Hmmm.  I brought up the System Preferences panel on my Mac, and sure
enough, there's a checkbox, "Exclude simple hostnames".  So I guess it's
not a bug, though none of my Macs are configured that way.

Bill

From a.badger at gmail.com  Tue Jun 22 19:21:23 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Tue, 22 Jun 2010 13:21:23 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100622172123.GG5787@unaka.lan>

On Tue, Jun 22, 2010 at 08:31:13PM +0900, Stephen J. Turnbull wrote:
> Toshio Kuratomi writes:
>  > unicode handling redesign.  I'm stating my reading of the RFC not to defend
>  > the use case Philip has, but because I think that the outlook that non-text
>  > uris (before being percentencoded) are violations of the RFC
> 
> That's not what I'm saying.  What I'm trying to point out is that
> manipulating a bytes object as an URI sort of presumes a lot about its
> encoding as text.

I think we're more or less in agreement now but here I'm not sure.  What
manipulations are you thinking about?  Which stage of URI construction are
you considering?

I've just taken a quick look at python3.1's urllib module and I see that
there is a bit of confusion there.  But it's not about unicode vs bytes but
about whether a URI should be operated on at the real URI level or the
data-that-makes-a-uri level.

* all functions I looked at take python3 str rather than bytes so there's no
  confusing stuff here
* urllib.request.urlopen takes a strict uri.  That means that you must have
  a percent encoded uri at this point
* urllib.parse.urljoin takes regular string values
* urllib.parse.urlparse and urllib.parse.urlunparse take regular string values
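
A short sketch of the two levels mentioned above, using only
urllib.parse functions from the Python 3 standard library. The host
name and file name are placeholders chosen for illustration:

```python
from urllib.parse import quote, urljoin, urlparse, urlunparse

# Data-that-makes-a-URI level: raw text that still has to be
# percent-encoded before it is a valid URI component.
raw_name = "r\u00e9sum\u00e9 2010.txt"
encoded = quote(raw_name)            # UTF-8 is the default encoding
print(encoded)                       # r%C3%A9sum%C3%A9%202010.txt

# Real-URI level: all of these operate on str values.
base = "http://example.com/docs/"
full = urljoin(base, encoded)
print(full)

parts = urlparse(full)
print(parts.path)
print(urlunparse(parts) == full)     # True
```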

> Since many of the URIs we deal with are more or
> less textual, why not take advantage of that?
>
Cool, so to summarize what I think we agree on:

* Percent encoded URIs are text according to the RFC.
* The data that is used to construct the URI is not defined as text by the
  RFC.
* However, it is very often text in an unspecified encoding
* It is extremely convenient for programmers to be able to treat the data
  that is used to form a URI as text in nearly all common cases.
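
The points above can be sketched with urllib.parse: the data going in
may be arbitrary bytes, while the percent-encoded result is always
ASCII text. The payload below is invented for illustration:

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# Data that is not text in any particular encoding.
payload = b"\xff\xfe\x00binary"
component = quote(payload)           # quote() accepts bytes as well as str
print(component)                     # %FF%FE%00binary

# The percent-encoded form is plain ASCII text...
component.encode("ascii")            # does not raise

# ...and the original bytes come back unchanged.
assert unquote_to_bytes(component) == payload

# The common, convenient case: the data really is text.
assert unquote(quote("caf\u00e9")) == "caf\u00e9"
```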

-Toshio

From guido at python.org  Tue Jun 22 18:53:00 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 22 Jun 2010 09:53:00 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <89AE7ED6-FB94-45DA-9432-7FCBA25A56BF@gmail.com>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com> 
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com> 
	<89AE7ED6-FB94-45DA-9432-7FCBA25A56BF@gmail.com>
Message-ID: <AANLkTin2LbL1WOz_81WKC9FB2DfjEI36MpaJqbOfqhEl@mail.gmail.com>

On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
>
> On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
>
> This is a common pain-point for porting software to 3.x - you had a
> string, it kinda worked most of the time before, but now you need to keep
> track of text too and the functions which seemed to work on bytes no longer
> do.
>
> Thanks Glyph.  That is a nice summary of one kind of challenge facing
> programmers.

Ironically, Glyph also described the pain in 2.x: it only "kinda" worked.

-- 
--Guido van Rossum (python.org/~guido)

From raymond.hettinger at gmail.com  Tue Jun 22 19:31:36 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 22 Jun 2010 10:31:36 -0700
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <AANLkTil4XWNZSzsa8j5r2eCbp9mugzUj2zR1WeVH9FBj@mail.gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
	<4FBE15CB-2397-46B2-8417-588F8785BA20@gmail.com>
	<AANLkTil4XWNZSzsa8j5r2eCbp9mugzUj2zR1WeVH9FBj@mail.gmail.com>
Message-ID: <C3E0AF18-848C-4C1E-AE2C-F8AA022DACDD@gmail.com>


On Jun 22, 2010, at 10:08 AM, Benjamin Peterson wrote:
> . There was a typo in
> abc.py which prevented it from raising errors when non new-style class
> objects were passed in.

For 2.x, that was probably a good thing, a happy accident
that made it possible to register existing mapping classes
as a MutableMapping.

"Fixing" that typo will break code that currently uses ABCs
with old-style classes.  

I believe we are better-off leaving this as it was released in 2.6.


Raymond

From guido at python.org  Tue Jun 22 18:49:27 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 22 Jun 2010 09:49:27 -0700
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com> 
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com> 
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com> 
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com> 
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com> 
	<87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTik236Mq6qUChBqOZ_j6uJhOLmAoTjI5C_mtKv-l@mail.gmail.com>

On Mon, Jun 21, 2010 at 10:28 PM, Stephen J. Turnbull
<stephen at xemacs.org> wrote:
> Michael Urman writes:
>
>  > It is somewhat troublesome that there doesn't appear to be an obvious
>  > built-in idempotent-when-possible function that gives back the
>  > provided bytes/str,
>
> If you want something idempotent, it's already the case that
> bytes(b'abc') => b'abc'.  What might be desirable is to make
> bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
> (or maybe ISO 8859/1).

No, no, no! That's just what Python 2 did.

> Unfortunately, str(b'abc') already does work, but
>
> steve at uwakimon ~ $ python3.1
> Python 3.1.2 (release31-maint, May 12 2010, 20:15:06)
> [GCC 4.3.4] on linux2
> Type "help", "copyright", "credits" or "license" for more information.
> >>> str(b'abc')
> "b'abc'"
> >>>
>
> Oops.  You can see why that probably "should" be the case.

There is a near-contract that str() of pretty much anything returns a
"printable" version of that thing.
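
A quick sketch of the Python 3 behavior being discussed, using only
builtins:

```python
data = b"abc"

print(str(data))                 # b'abc'  (the repr, not a decode)
print(str(data, "ascii"))        # abc
print(data.decode("ascii"))      # abc

try:
    bytes("abc")                 # no encoding given
except TypeError as exc:
    print("TypeError:", exc)

print(bytes("abc", "ascii"))     # b'abc'
print(bytes(data) == data)       # True
```

So the conversions exist in both directions, but only with an explicit
encoding; the one-argument forms never guess.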

-- 
--Guido van Rossum (python.org/~guido)

From foom at fuhm.net  Tue Jun 22 20:07:18 2010
From: foom at fuhm.net (James Y Knight)
Date: Tue, 22 Jun 2010 14:07:18 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
Message-ID: <0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>


On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
> Similarly I'd expect (from experience) that a programmer using  
> Python to want to take the same approach, sticking with unencoded  
> data in nearly all situations.

Yeah. This is a real issue I have with the direction Python3 went: it  
pushes you into decoding everything to unicode early, even when you  
don't care -- all you really wanted to do is pass it from one API to  
another, with some well-defined transformations, which don't actually  
depend on it having been decoded properly. (For example, extracting  
the path from the URL and attempting to open it as a file on the  
filesystem.)

This means that Python3 programs can become *more* fragile in the face  
of random data you encounter out in the real world, rather than less  
fragile, which was the goal of the whole exercise.

The surrogateescape method is a nice workaround for this, but I can't  
help thinking that it might've been better to just treat stuff as  
possibly-invalid-but-probably-utf8 byte-strings from input, through  
processing, to output. It seems kinda too late for that, though: next  
time someone designs a language, they can try that. :)
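
For readers who haven't used it, a minimal sketch of the
surrogateescape round trip mentioned above. The byte string is an
invented example of not-quite-UTF-8 input:

```python
raw = b"caf\xe9/\xff.txt"     # not valid UTF-8

text = raw.decode("utf-8", "surrogateescape")
# Undecodable bytes become lone surrogates in the range U+DC80..U+DCFF.
print(ascii(text))            # 'caf\udce9/\udcff.txt'

# String operations that don't touch the escaped bytes still work.
directory, _, name = text.partition("/")
print(ascii(name))            # '\udcff.txt'

# Encoding with the same handler restores the original bytes exactly.
assert text.encode("utf-8", "surrogateescape") == raw
```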

James

From mal at egenix.com  Tue Jun 22 20:09:24 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Tue, 22 Jun 2010 20:09:24 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
Message-ID: <4C20FC54.9000608@egenix.com>

Guido van Rossum wrote:
> [Just addressing one little issue here; generally I'm just happy that
> we're discussing this issue in such detail from so many points of
> view.]
> 
> On Mon, Jun 21, 2010 at 10:50 PM, Toshio Kuratomi <a.badger at gmail.com> wrote:
>> [...] Would urljoin(b_base, b_subdir) => bytes and
>> urljoin(u_base, u_subdir) => unicode be acceptable though?  (I think, given
>> other options, I'd rather see two separate functions, though.  It seems more
>> discoverable and less prone to taking bad input some of the time to have two
>> functions that clearly only take one type of data apiece.)
> 
> Hm. I'd rather see a single function (it would be "polymorphic" in my
> earlier terminology). After all a large number of string method calls
> (and some other utility function calls) already look the same
> regardless of whether they are handling bytes or text (as long as it's
> uniform). If the building blocks are all polymorphic it's easier to
> create additional polymorphic functions.
> 
> FWIW, there are two problems with polymorphic functions, though they
> can be overcome:
> 
> (1) Literals.
> 
> If you write something like x.split('&') you are implicitly assuming x
> is text. I don't see a very clean way to overcome this; you'll have to
> implement some kind of type check e.g.
> 
>     x.split('&') if isinstance(x, str) else x.split(b'&')
> 
> A handy helper function can be written:
> 
>   def literal_as(constant, variable):
>       if isinstance(variable, str):
>           return constant
>       else:
>           return constant.encode('utf-8')
> 
> So now you can write x.split(literal_as('&', x)).

This polymorphism is what we used in Python2 a lot to write
code that works for both Unicode and 8-bit strings.

Unfortunately, this no longer works as easily in Python3 due
to the literals sometimes having the wrong type and using
such a helper function slows things down a lot.

It would be great if we could have something like the above as
builtin method:

x.split('&'.as(x))
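
Until something like that exists, the helper quoted above can be used
as-is. A small sketch of a polymorphic function built on it;
split_query is a made-up name for illustration. (Note that "as" is a
reserved word, so a real method would need a different spelling.)

```python
def literal_as(constant, variable):
    """Return `constant` (a str literal) in the same type as `variable`."""
    if isinstance(variable, str):
        return constant
    return constant.encode("utf-8")


def split_query(query):
    """Hypothetical polymorphic helper: works on str and on bytes alike."""
    amp = literal_as("&", query)
    eq = literal_as("=", query)
    return [tuple(pair.split(eq, 1)) for pair in query.split(amp)]


print(split_query("a=1&b=2"))      # [('a', '1'), ('b', '2')]
print(split_query(b"a=1&b=2"))     # [(b'a', b'1'), (b'b', b'2')]
```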

Perhaps something to discuss on the language summit at EuroPython.

Too bad we can't add such porting enhancements to Python2 anymore.

-- 
Marc-Andre Lemburg
eGenix.com

From a.badger at gmail.com  Tue Jun 22 20:44:44 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Tue, 22 Jun 2010 14:44:44 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTinv0RpzrqtQiL3hN170-zICQQMcDUJJ_2Kb2NKH@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>
	<87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTinv0RpzrqtQiL3hN170-zICQQMcDUJJ_2Kb2NKH@mail.gmail.com>
Message-ID: <20100622184444.GJ5787@unaka.lan>

On Tue, Jun 22, 2010 at 08:24:28AM -0500, Michael Urman wrote:
> On Tue, Jun 22, 2010 at 00:28, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> > Michael Urman writes:
> >
> >  > It is somewhat troublesome that there doesn't appear to be an obvious
> >  > built-in idempotent-when-possible function that gives back the
> >  > provided bytes/str,
> >
> > If you want something idempotent, it's already the case that
> > bytes(b'abc') => b'abc'.  What might be desirable is to make
> > bytes('abc') work and return b'abc', but only if 'abc' is pure ASCII
> > (or maybe ISO 8859/1).
> 
> By idempotent-when-possible, I mean to_bytes(str_or_bytes, encoding,
> errors) that would pass an instance of bytes through, or encode an
> instance of str. And of course a to_str that performs similarly,
> passing str through and decoding bytes. While bytes(b'abc') will give
> me b'abc', neither bytes('abc') nor bytes(b'abc', 'latin-1') get me
> the b'abc' I want to see.
> 
A month or so ago, I finally broke down and wrote a python2 library that had
these functions in it (along with a bunch of other trivial boilerplate
functions that I found myself writing over and over in different projects)

  https://fedorahosted.org/releases/k/i/kitchen/docs/api-text-converters.html#unicode-and-byte-str-conversion

I suppose I could port this to python3 and we could see if it gains adoption
as a thirdparty addon.  I have been hesitating over doing that since I don't
use python3 for everyday work and I have a vague feeling that 2to3 won't
understand what that code needs to do.

-Toshio

From brett at python.org  Tue Jun 22 21:27:49 2010
From: brett at python.org (Brett Cannon)
Date: Tue, 22 Jun 2010 12:27:49 -0700
Subject: [Python-Dev] State of json in 2.7
In-Reply-To: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com>
References: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com>
Message-ID: <AANLkTik25F1IBvA_9_bWt9gSg4SZnXwzgv5X8DN-vW_c@mail.gmail.com>

[cc'ing Bob on his gmail address; didn't have any other address handy
so I don't know if this will actually get to him]

On Tue, Jun 22, 2010 at 09:54, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
> It looks like simplejson 2.1.0 and 2.1.1 have been released:
>
> http://bob.pythonmac.org/archives/2010/03/10/simplejson-210/
> http://bob.pythonmac.org/archives/2010/03/31/simplejson-211/
>
> It looks like any changes that didn't come from the Python tree didn't
> go into the Python tree, either.

Has anyone asked Bob why he did this? There might be a logical reason.

-Brett

From bob at redivi.com  Tue Jun 22 22:11:10 2010
From: bob at redivi.com (Bob Ippolito)
Date: Tue, 22 Jun 2010 13:11:10 -0700
Subject: [Python-Dev] State of json in 2.7
In-Reply-To: <AANLkTik25F1IBvA_9_bWt9gSg4SZnXwzgv5X8DN-vW_c@mail.gmail.com>
References: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com>
	<AANLkTik25F1IBvA_9_bWt9gSg4SZnXwzgv5X8DN-vW_c@mail.gmail.com>
Message-ID: <AANLkTimPM6WpXJE5zXe0K3DBEtxuM_WvKZpZj9CjHUmz@mail.gmail.com>

On Tuesday, June 22, 2010, Brett Cannon <brett at python.org> wrote:
> [cc'ing Bob on his gmail address; didn't have any other address handy
> so I don't know if this will actually get to him]
>
> On Tue, Jun 22, 2010 at 09:54, Dirkjan Ochtman <dirkjan at ochtman.nl> wrote:
>> It looks like simplejson 2.1.0 and 2.1.1 have been released:
>>
>> http://bob.pythonmac.org/archives/2010/03/10/simplejson-210/
>> http://bob.pythonmac.org/archives/2010/03/31/simplejson-211/
>>
>> It looks like any changes that didn't come from the Python tree didn't
>> go into the Python tree, either.
>
> Has anyone asked Bob why he did this? There might be a logical reason.

I've just been busy. It's not trivial to move patches from one to the
other, so it's not something that has been easy for me to get around
to actually doing. It seems that more often than not when I have had
time to look at something, it didn't line up well with python's
release schedule.

(and speaking of busy I'm en route for a week long honeymoon so don't
expect much else from me on this thread)

-bob

From tjreedy at udel.edu  Tue Jun 22 22:19:45 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 22 Jun 2010 16:19:45 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <C56762CD-C47C-4153-BAED-32B6786BDE5C@twistedmatrix.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>	<20100621164650.16A093A414B@sparrow.telecommunity.com>	<AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>	<20100621181750.267933A404D@sparrow.telecommunity.com>
	<C56762CD-C47C-4153-BAED-32B6786BDE5C@twistedmatrix.com>
Message-ID: <hvr5t1$7bl$1@dough.gmane.org>

On 6/22/2010 1:22 AM, Glyph Lefkowitz wrote:

> The thing that I have heard in passing from a couple of folks with
> experience in this area is that some older software in asia would
> present characters differently if they were originally encoded in a
> "japanese" encoding versus a "chinese" encoding, even though they were
> really "the same" characters.

As I tried to say in another post, that to me is similar to wanting to 
present English text in different fonts depending on whether spoken by 
an American or Brit, or a modern person versus a Renaissance person.

> I do know that Han Unification is a giant political mess
> (<http://en.wikipedia.org/wiki/Han_unification> makes for some

Thanks, I will take a look.

> interesting reading), but my understanding is that it has handled enough
> of the cases by now that one can write software to display asian
> languages and it will basically work with a modern version of unicode.
> (And of course, there's always the private use area, as Stephen Turnbull
> pointed out.)
>
> Regardless, this is another example where keeping around a string isn't
> really enough. If you need to display a japanese character in a distinct
> way because you are operating in the japanese *script*, you need a tag
> surrounding your data that is a hint to its presentation. The fact that
> these presentation hints were sometimes determined by their encoding is
> an unfortunate historical accident.

Yes. The asian languages I know anything about seem to natively have 
almost none of the symbols English has, many borrowed from math, that 
have been pressed into service for text markup.


-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue Jun 22 22:32:40 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 22 Jun 2010 16:32:40 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTinv0RpzrqtQiL3hN170-zICQQMcDUJJ_2Kb2NKH@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>	<20100621145133.7F5333A404D@sparrow.telecommunity.com>	<AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>	<87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTinv0RpzrqtQiL3hN170-zICQQMcDUJJ_2Kb2NKH@mail.gmail.com>
Message-ID: <hvr6l9$adb$1@dough.gmane.org>

On 6/22/2010 9:24 AM, Michael Urman wrote:

> By idempotent-when-possible, I mean to_bytes(str_or_bytes, encoding,
> errors) that would pass an instance of bytes through, or encode an
> instance of str. And of course a to_str that performs similarly,
> passing str through and decoding bytes. While bytes(b'abc') will give
> me b'abc', neither bytes('abc') nor bytes(b'abc', 'latin-1') get me
> the b'abc' I want to see.
>
> These are trivial functions;
> I just don't fully understand why the capability isn't baked in.

Possible reasons: They are special purpose functions easily built on the 
basic functions provided. Fine for a 3rd party library. Most people do 
not need them. Some might be misled by them. As others have said, "Not 
every one-liner should be builtin".
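
A sketch of how such functions can be built on the basic ones. The
names to_bytes and to_str follow the quoted message; the utf-8 default
and the TypeError for other types are assumptions made here:

```python
def to_bytes(value, encoding="utf-8", errors="strict"):
    """Pass bytes through unchanged; encode str."""
    if isinstance(value, bytes):
        return value
    if isinstance(value, str):
        return value.encode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


def to_str(value, encoding="utf-8", errors="strict"):
    """Pass str through unchanged; decode bytes."""
    if isinstance(value, str):
        return value
    if isinstance(value, bytes):
        return value.decode(encoding, errors)
    raise TypeError("expected str or bytes, got %s" % type(value).__name__)


print(to_bytes("abc"), to_bytes(b"abc", "latin-1"))   # b'abc' b'abc'
print(to_str(b"abc"), to_str("abc"))                  # abc abc
```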

-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue Jun 22 22:41:54 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 22 Jun 2010 16:41:54 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTin2LbL1WOz_81WKC9FB2DfjEI36MpaJqbOfqhEl@mail.gmail.com>
References: <201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>
	<89AE7ED6-FB94-45DA-9432-7FCBA25A56BF@gmail.com>
	<AANLkTin2LbL1WOz_81WKC9FB2DfjEI36MpaJqbOfqhEl@mail.gmail.com>
Message-ID: <hvr76i$cgg$1@dough.gmane.org>

On 6/22/2010 12:53 PM, Guido van Rossum wrote:
> On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
> <raymond.hettinger at gmail.com>  wrote:
>>
>> On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
>>
>>    This is a common pain-point for porting software to 3.x - you had a
>> string, it kinda worked most of the time before, but now you need to keep
>> track of text too and the functions which seemed to work on bytes no longer
>> do.
>>
>> Thanks Glyph.  That is a nice summary of one kind of challenge facing
>> programmers.
>
> Ironically, Glyph also described the pain in 2.x: it only "kinda" worked.

The people with problematic code to convert must include some who 
managed to tolerate and perhaps suppress the pain. I suspect that 
conversion attempts bring it back to the surface. It is natural to 
blame the re-surfacer rather than the original source. (As in 'blame the 
messenger').


-- 
Terry Jan Reedy


From tjreedy at udel.edu  Tue Jun 22 22:47:58 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 22 Jun 2010 16:47:58 -0400
Subject: [Python-Dev] [OT] glyphs [was Re: email package status in 3.X]
In-Reply-To: <201006222052.39734.steve@pearwood.info>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100621184700.BAD7F3A404D@sparrow.telecommunity.com>	<hvp4ll$230$1@dough.gmane.org>
	<201006222052.39734.steve@pearwood.info>
Message-ID: <hvr7i1$dpq$1@dough.gmane.org>

On 6/22/2010 6:52 AM, Steven D'Aprano wrote:
> On Tue, 22 Jun 2010 11:46:27 am Terry Reedy wrote:
>> 3. Unicode disclaims direct representation of glyphic variants
>> (though again, exceptions were made for asian acceptance). For
>> example, in English, mechanically printed 'a' and 'g' are different
>> from manually printed 'a' and 'g'. Representing both by the same
>> codepoint, in itself, loses information. One who wishes to preserve
>> the distinction must instead use a font tag or perhaps a
>> <handprinted>  tag. Similarly, older English had a significantly
>> different glyph for 's', which looks more like a modern 'f'.
>
> An unfortunate example, as the old English long-s gets its own Unicode
> codepoint.

Whoops. I suppose I should thank you for the correction so I never make 
the same error again. Thank you.
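
This is easy to check from Python with the standard unicodedata
module; a short sketch:

```python
import unicodedata

long_s = "\u017f"
print(unicodedata.name(long_s))   # LATIN SMALL LETTER LONG S
print(long_s == "s")              # False: distinct codepoints
print(long_s.upper())             # S
```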

> http://en.wikipedia.org/wiki/Long_s

Very interesting to find out the source of both the integral sign and 
shilling symbols.

-- 
Terry Jan Reedy


From cyounkins at gmail.com  Tue Jun 22 23:14:45 2010
From: cyounkins at gmail.com (Craig Younkins)
Date: Tue, 22 Jun 2010 17:14:45 -0400
Subject: [Python-Dev] Use of cgi.escape can lead to XSS vulnerabilities
Message-ID: <AANLkTil9vEOPLgenyPOu-F3OuC-R61ak_KoEvNARrGvW@mail.gmail.com>

Hello,

The method in question: http://docs.python.org/library/cgi.html#cgi.escape
http://svn.python.org/view/python/tags/r265/Lib/cgi.py?view=markup   # at
the bottom

"Convert the characters '&', '<' and '>' in string s to HTML-safe sequences.
Use this if you need to display text that might contain such characters in
HTML. If the optional flag quote is true, the quotation mark character ('"')
is also translated; this helps for inclusion in an HTML attribute value, as
in <A HREF="...">. If the value to be quoted might include single- or
double-quote characters, or both, consider using the quoteattr() function in
the xml.sax.saxutils module instead."

cgi.escape never escapes single quote characters, which can easily lead to a
Cross-Site Scripting (XSS) vulnerability. This seems to be known by many,
but a quick search reveals many are using cgi.escape for HTML attribute
escaping.

The intended use of this method is unclear to me. Up to and including the
latest published version of Mako (0.3.3), this method was the HTML escaping
method. Used in this manner, single-quoted attributes with user-supplied
data are easily susceptible to cross-site scripting vulnerabilities.

Proof of concept in Mako:
>>> from mako.template import Template
>>> print Template("<div class='${data}'>",
default_filters=['h']).render(data="' onload='alert(1);' id='")
<div class='' onload='alert(1);' id=''>

I've emailed Michael Bayer, the creator of Mako, and this will be fixed in
version 0.3.4.

While the documentation says "if the value to be quoted might include
single- or double-quote characters... [use the] xml.sax.saxutils module
instead," it also implies that this method will make input safe for HTML.
Because this method escapes 4 of the 5 key XML characters, it is reasonable
to expect some will use it in the manner Mako did.

I suggest rewording the documentation for the method making it more clear
what it should and should not be used for. I would like to see the method
changed to properly escape single-quotes, but if it is not changed, the
documentation should explicitly say this method does not make input safe for
inclusion in HTML.
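
A sketch of the difference in Python 3 terms. cgi.escape itself is no
longer available in current Python 3, so old_cgi_escape below is a
re-implementation of the documented behavior written for this example,
not the library's code; html.escape and xml.sax.saxutils.quoteattr are
the standard-library calls:

```python
from html import escape
from xml.sax.saxutils import quoteattr


def old_cgi_escape(s, quote=False):
    """Re-implementation of the cgi.escape behavior described above."""
    s = s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s


payload = "' onload='alert(1);' id='"

# Single quotes pass through, so a single-quoted attribute is broken out of.
print("<div class='%s'>" % old_cgi_escape(payload, quote=True))

# html.escape (Python 3.2+) also escapes the single quote by default.
print("<div class='%s'>" % escape(payload))

# quoteattr picks the quoting style itself and returns the quotes too.
print("<div class=%s>" % quoteattr(payload))
```

Only the first line of output contains an unescaped onload attribute.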

Shameless plug: http://www.PythonSecurity.org/

Craig Younkins

From ianb at colorstudy.com  Tue Jun 22 22:46:45 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 22 Jun 2010 15:46:45 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
Message-ID: <AANLkTik8nJRRgCT6nXHvgI0FwnBfJF955X5yN2Tuqw6p@mail.gmail.com>

On Tue, Jun 22, 2010 at 1:07 PM, James Y Knight <foom at fuhm.net> wrote:

> The surrogateescape method is a nice workaround for this, but I can't help
> thinking that it might've been better to just treat stuff as
> possibly-invalid-but-probably-utf8 byte-strings from input, through
> processing, to output. It seems kinda too late for that, though: next time
> someone designs a language, they can try that. :)
>

surrogateescape does help a lot; my only problem with it is that it's
out-of-band information.  That is, if you have data that went through
data.decode('utf8', 'surrogateescape') you can restore it to bytes or
transcode it to another encoding, but you have to know that it was decoded
specifically that way.  And of course if you did have to transcode it (e.g.,
text.encode('utf8', 'surrogateescape').decode('latin1')) then if you had
actually handled the text in any way you may have broken it; you don't
*really* have valid text.  A lazier solution feels like it would be easier
and more transparent to work with.
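
A small sketch of the round trip described above, assuming Python 3.1 or
later (where the surrogateescape handler exists); the sample bytes are made
up for illustration. The point is that the resulting str carries a lone
surrogate, and nothing in the object itself records which handler is needed
to get the bytes back.

```python
# Latin-1 bytes that are not valid UTF-8.
raw = b"caf\xe9"

# The undecodable byte 0xE9 is smuggled through as the lone surrogate U+DCE9.
text = raw.decode("utf-8", "surrogateescape")

# Restoring the original bytes works only if you know to use the same handler.
restored = text.encode("utf-8", "surrogateescape")

# Transcoding: go back to bytes first, then decode with the real encoding.
transcoded = text.encode("utf-8", "surrogateescape").decode("latin-1")

# Without the out-of-band knowledge, a plain encode fails on the surrogate.
try:
    text.encode("utf-8")
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False
```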

But... I also don't see any major language constraint to having another kind
of string that is bytes+encoding.  I think PJE brought up a problem with a
couple coercion aspects.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From tjreedy at udel.edu  Tue Jun 22 23:21:53 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Tue, 22 Jun 2010 17:21:53 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvqorq$i69$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>	<hvijae$9tc$1@dough.gmane.org>	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
	<hvqorq$i69$1@dough.gmane.org>
Message-ID: <hvr9hj$l5b$1@dough.gmane.org>

Tres, I am a Python3 enthusiast and realist. I did not expect major 
adoption for about 3 years (more optimistic than the 5 years of some).

If you are feeling pressured to 'move' to Python3, it is not from me. I 
am sure you will do so on your own, perhaps even with enthusiasm, when 
it will be good for *you* to do so.

If someone wants to contribute while sticking to Python2, it's easy. The 
tracker has perhaps 2000 open 2.x issues, hundreds with no responses. If 
more Python2 people worked on making 2.7 as bug-free as possible, the 
developers would be freer to make 3.2 as good as possible (which is what 
*I* want).

The porting of numpy (which I suspect has gotten some urging) will not 
just benefit 'numerical' computing. For instance, there cannot be a 3.x 
version of pygame until there is a 3.x version of numpy, its main Python 
dependency. (The C Simple Directmedia Library it also wraps and builds 
upon does not care.)

-- 
Terry Jan Reedy


From guido at python.org  Tue Jun 22 19:03:29 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 22 Jun 2010 10:03:29 -0700
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvqorq$i69$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com> 
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com> 
	<hvijae$9tc$1@dough.gmane.org>
	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com> 
	<hvqorq$i69$1@dough.gmane.org>
Message-ID: <AANLkTilfd__XUG4coogDG61tnFabhUL6ZnvbS7LJFo2O@mail.gmail.com>

On Tue, Jun 22, 2010 at 9:37 AM, Tres Seaver <tseaver at palladion.com> wrote:
> Any "turdiness" (which I am *not* arguing for) is a natural consequence
> of the kinds of backward incompatibilities which were *not* ruled out
> for Python 3, along with the (early, now waning) "build it and they will
> come" optimism about adoption rates.

FWIW, my optimism is *not* waning. I think it's good that we're
having this discussion and I expect something useful will come out of
it; I also expect in general that the (admittedly serious) problem of
having to port all dependencies will be solved in the next few years.
Not by magic, but because many people are taking small steps in the
right direction, and there will be light eventually. In the mean time
I don't blame anyone for sticking with 2.x or being too busy to help
port stuff to 3.x. Python 3 has been a long time in the making -- it
will be a bit longer still, which was expected.

-- 
--Guido van Rossum (python.org/~guido)

From janssen at parc.com  Tue Jun 22 23:29:50 2010
From: janssen at parc.com (Bill Janssen)
Date: Tue, 22 Jun 2010 14:29:50 PDT
Subject: [Python-Dev] Use of cgi.escape can lead to XSS vulnerabilities
In-Reply-To: <AANLkTil9vEOPLgenyPOu-F3OuC-R61ak_KoEvNARrGvW@mail.gmail.com>
References: <AANLkTil9vEOPLgenyPOu-F3OuC-R61ak_KoEvNARrGvW@mail.gmail.com>
Message-ID: <10286.1277242190@parc.com>

Craig Younkins <cyounkins at gmail.com> wrote:

> cgi.escape never escapes single quote characters, which can easily lead to a
> Cross-Site Scripting (XSS) vulnerability. This seems to be known by many,
> but a quick search reveals many are using cgi.escape for HTML attribute
> escaping.

Did you file a bug report?

Bill

From robertc at robertcollins.net  Tue Jun 22 23:40:45 2010
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 23 Jun 2010 09:40:45 +1200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C20FC54.9000608@egenix.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<4C20FC54.9000608@egenix.com>
Message-ID: <AANLkTinBsCzx-et_w4pB_W2x-uueKtnhVzo8YYgHMUc9@mail.gmail.com>

On Wed, Jun 23, 2010 at 6:09 AM, M.-A. Lemburg <mal at egenix.com> wrote:

>>           return constant.encode('utf-8')
>>
>> So now you can write x.split(literal_as('&', x)).
>
> This polymorphism is what we used in Python2 a lot to write
> code that works for both Unicode and 8-bit strings.
>
> Unfortunately, this no longer works as easily in Python3 due
> to the literals sometimes having the wrong type and using
> such a helper function slows things down a lot.

It didn't work in 2 either - see for instance the traceback module with
an Exception with unicode args and a non-ascii file path - the file
path is in its bytes form, the string joining logic triggers an
implicit upcast and *boom*.

> Too bad we can't add such porting enhancements to Python2 anymore

Perhaps a 'py3compat' module on pypi, with things like the py._builtin
reraise helper and so forth ?

-Rob

From martin at v.loewis.de  Tue Jun 22 23:50:49 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Tue, 22 Jun 2010 23:50:49 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTimrLwsklrQzBLbjf0LOCycp_gRa97gqc721NsNs@mail.gmail.com>
References: <73196.1277143019@parc.com>	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>	<75635.1277147585@parc.com>	<20100621212904.7bec83f6@pitrou.net>	<77297.1277150242@parc.com>	<1277150570.3369.1.camel@localhost.localdomain>	<4C1FC7E6.5070707@voidspace.org.uk>	<4C1FD5D6.7070007@v.loewis.de>	<4C1FD84B.3030202@voidspace.org.uk>	<4C1FDB65.4020503@v.loewis.de>	<4C1FDF1C.2060308@voidspace.org.uk>	<4C1FE4AF.80009@v.loewis.de>	<AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>	<EAE2C517-C5EA-4EF7-A4A0-286C3B08381D@mac.com>
	<AANLkTimrLwsklrQzBLbjf0LOCycp_gRa97gqc721NsNs@mail.gmail.com>
Message-ID: <4C213039.5090300@v.loewis.de>

> This effectively substitutes getgrouplist called on the current user
> for getgroups.  In 3.x, I believe the correct action will be to
> provide direct access to getgrouplist which, while not POSIX (yet?),
> is widely available.

As a policy, adding non-POSIX functions to the posix module is perfectly 
fine, as long as there is an autoconf test for it
(plain ifdefs are grudgingly accepted also).

Regards,
Martin

From fdrake at acm.org  Tue Jun 22 21:23:13 2010
From: fdrake at acm.org (Fred Drake)
Date: Tue, 22 Jun 2010 15:23:13 -0400
Subject: [Python-Dev] State of json in 2.7
In-Reply-To: <AANLkTilTBED-RkN4bTZc6pN6QqDcW6ShREnfP3Jh6zz9@mail.gmail.com>
References: <AANLkTilQjL07Z-aHgE7I_FGGnGTEt9ceX-eo29X1vmqE@mail.gmail.com> 
	<AANLkTilTBED-RkN4bTZc6pN6QqDcW6ShREnfP3Jh6zz9@mail.gmail.com>
Message-ID: <AANLkTim9GRCJR3tldJ_klAx9L6SSmTT_FU9e_WP_YZXl@mail.gmail.com>

On Tue, Jun 22, 2010 at 12:56 PM, Benjamin Peterson <benjamin at python.org> wrote:
> Never have externally maintained packages.

Seriously!  I concur with this.

Fortunately, it's not a real problem in this case.

There's the (maintained) simplejson package, and the unmaintained json
package.  And simplejson works with older versions of Python, too,
:-)


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From ncoghlan at gmail.com  Tue Jun 22 23:41:51 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Jun 2010 07:41:51 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
Message-ID: <AANLkTikYmc0-46H9plWmSGIcgfw_fPM81ZG2EuxMh4r2@mail.gmail.com>

On Wed, Jun 23, 2010 at 2:17 AM, Guido van Rossum <guido at python.org> wrote:
> (1) Literals.
>
> If you write something like x.split('&') you are implicitly assuming x
> is text. I don't see a very clean way to overcome this; you'll have to
> implement some kind of type check e.g.
>
>    x.split('&') if isinstance(x, str) else x.split(b'&')
>
> A handy helper function can be written:
>
>  def literal_as(constant, variable):
>      if isinstance(variable, str):
>          return constant
>      else:
>          return constant.encode('utf-8')
>
> So now you can write x.split(literal_as('&', x)).

I think this is a key point. In checking the behaviour of the os
module bytes APIs (see below), I used a simple filter along the lines
of:

  [x for x in seq if x.endswith("b")]

It would be nice if code along those lines could easily be made polymorphic.

Maybe what we want is a new class method on bytes and str (this idea
is similar to what MAL suggests later in the thread):

  def coerce(cls, obj, encoding=None, errors='surrogateescape'):
    if isinstance(obj, cls):
        return obj
    if encoding is None:
        encoding = sys.getdefaultencoding()
    # This is the str version; bytes.coerce would use obj.encode() instead
    return obj.decode(encoding, errors)

Then my example above could be made polymorphic (for ASCII compatible
encodings) by writing:

  [x for x in seq if x.endswith(x.coerce("b"))]

I'm trying to see downsides to this idea, and I'm not really seeing
any (well, other than 2.7 being almost out the door and the fact we'd
have to grant ourselves an exception to the language moratorium)
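
Since no such class method exists on str or bytes, the idea can only be
tried out today as a plain function. The following is a sketch under that
assumption: coerce and ends_with_b are hypothetical names, and the target
type is passed explicitly instead of being taken from a method receiver.

```python
import sys


def coerce(cls, obj, encoding=None, errors="surrogateescape"):
    """Return obj as an instance of cls, where cls is str or bytes."""
    if isinstance(obj, cls):
        return obj
    if encoding is None:
        encoding = sys.getdefaultencoding()
    if issubclass(cls, str):
        # bytes -> str
        return obj.decode(encoding, errors)
    # str -> bytes
    return obj.encode(encoding, errors)


def ends_with_b(seq):
    """Polymorphic filter: works for sequences of str or of bytes."""
    return [x for x in seq if x.endswith(coerce(type(x), "b"))]


text_hits = ends_with_b(["environb", "getcwd"])
bytes_hits = ends_with_b([b"environb", b"getcwd"])
```

As with the proposal itself, this is only safe for literals that encode the
same way in every ASCII-compatible encoding.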

> (2) Data sources.
>
> These can be functions that produce new data from non-string data,
> e.g. str(<int>), read it from a named file, etc. An example is read()
> vs. write(): it's easy to create a (hypothetical) polymorphic stream
> object that accepts both f.write('booh') and f.write(b'booh'); but you
> need some other hack to make read() return something that matches a
> desired return type. I don't have a generic suggestion for a solution;
> for streams in particular, the existing distinction between binary and
> text streams works, of course, but there are other situations where
> this doesn't generalize (I think some XML interfaces have this
> awkwardness in their API for converting a tree to a string).

We may need to use the os and io modules as the precedents here:

os: normal API is text using the surrogateescape error handler,
parallel bytes API exposes raw bytes. Parallel API is polymorphic if
possible (e.g. os.listdir), but appends a 'b' to the name if the
polymorphic approach isn't practical (e.g. os.environb, os.getcwdb,
os.getenvb).
io: layered API, where both the raw bytes of the wire protocol and the
decoded text of the text layer are available
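
Both precedents can be seen in a few lines. This is a sketch assuming
Python 3; the sample text is made up, and os.environb / os.getenvb are left
out because they are only available on some platforms.

```python
import io
import os

# os: the same function is polymorphic on its argument type...
str_names = os.listdir(".")
bytes_names = os.listdir(b".")

# ...while functions without an argument to key off get a 'b' variant.
cwd_text = os.getcwd()
cwd_bytes = os.getcwdb()

# io: a text layer stacked on a bytes layer, with both levels reachable.
raw_layer = io.BytesIO("na\u00efve\n".encode("utf-8"))
text_layer = io.TextIOWrapper(raw_layer, encoding="utf-8")
line = text_layer.readline()          # decoded text
underlying = text_layer.buffer        # the bytes layer underneath
```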

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Wed Jun 23 00:07:07 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Jun 2010 08:07:07 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C20FC54.9000608@egenix.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<4C20FC54.9000608@egenix.com>
Message-ID: <AANLkTin2CNb-zb3BpjSNWVIQDLCVwUh6-FNRQd4XKAP7@mail.gmail.com>

On Wed, Jun 23, 2010 at 4:09 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> It would be great if we could have something like the above as
> builtin method:
>
> x.split('&'.as(x))

As per my other message, another possible (and reasonably intuitive)
spelling would be:

  x.split(x.coerce('&'))

Writing it as a helper function is also possible, although it would be
trickier to remember the correct argument ordering:

  def coerce_to(target, obj, encoding=None, errors='surrogateescape'):
    if isinstance(obj, type(target)):
        return obj
    if encoding is None:
        encoding = sys.getdefaultencoding()
    try:
        convert = obj.decode
    except AttributeError:
        convert = obj.encode
    return convert(encoding, errors)

  x.split(coerce_to(x, '&'))

> Perhaps something to discuss on the language summit at EuroPython.
>
> Too bad we can't add such porting enhancements to Python2 anymore.

Well, we can if we really want to, it just entails convincing Benjamin
to reschedule the 2.7 final release. Given the UserDict/ABC/old-style
classes issue, there's a fair chance there's going to be at least one
more 2.7 RC anyway.

That said, since this kind of coercion can be done in a helper
function, that should be adequate for the 2.x to 3.x conversion case
(for 2.x, the helper function can be defined to just return the second
argument since bytes and str are the same type, while the 3.x version
would look something like the code above)

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From greg.ewing at canterbury.ac.nz  Wed Jun 23 01:03:06 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Wed, 23 Jun 2010 11:03:06 +1200
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>
	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
Message-ID: <4C21412A.9030709@canterbury.ac.nz>

Benjamin Peterson wrote:

> IIRC this was because UserDict tries to be a MutableMapping but abcs
> require new style classes.

Are there any use cases for UserList and UserDict in new
code, now that list and dict can be subclassed?

If not, I don't think it would be a big problem if they
were left out of the ABC ecosystem. No worse than what
happens to any other existing user-defined class that
predates ABCs -- if people want them to inherit from
ABCs, they have to update their code. In this case, the
update would consist of changing subclasses to inherit
from list or dict instead.

-- 
Greg

From fuzzyman at voidspace.org.uk  Wed Jun 23 00:59:12 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Tue, 22 Jun 2010 23:59:12 +0100
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <4C21412A.9030709@canterbury.ac.nz>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
	<4C21412A.9030709@canterbury.ac.nz>
Message-ID: <4C214040.20304@voidspace.org.uk>

On 23/06/2010 00:03, Greg Ewing wrote:
> Benjamin Peterson wrote:
>
>> IIRC this was because UserDict tries to be a MutableMapping but abcs
>> require new style classes.
>
> Are there any use cases for UserList and UserDict in new
> code, now that list and dict can be subclassed?

Inheriting from list or dict isn't very useful as you have to 
override *every* method to control behaviour.

(For example with the dict if you override __setitem__ then update and 
setdefault (etc) don't go through your new __setitem__ and if you 
override __getitem__ then pop and friends don't go through your new 
__getitem__.)

In 2.6+ you can of course use the collections.MutableMapping abc, but if 
you want to write cross-Python version code UserDict is still useful. If 
you want abc support then you are *already* on 2.6+ though I guess.
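
A short illustration of the difference, as a sketch only: it assumes
Python 3 on CPython, where UserDict lives in the collections module rather
than in its own UserDict module as in 2.x, and the class names are made up.

```python
from collections import UserDict


class LowerDict(dict):
    """dict subclass: only direct item assignment sees the override."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)


class LowerUserDict(UserDict):
    """UserDict subclass: update() is built on top of __setitem__."""

    def __setitem__(self, key, value):
        super().__setitem__(key.lower(), value)


d = LowerDict()
d["A"] = 1            # goes through the override
d.update({"B": 2})    # bypasses it: the key stays upper case

u = LowerUserDict()
u["A"] = 1
u.update({"B": 2})    # routed through __setitem__, so the key is lowered

dict_keys = sorted(d)
userdict_keys = sorted(u)
```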

All the best,

Michael

>
> If not, I don't think it would be a big problem if they
> were left out of the ABC ecosystem. No worse than what
> happens to any other existing user-defined class that
> predates ABCs -- if people want them to inherit from
> ABCs, they have to update their code. In this case, the
> update would consist of changing subclasses to inherit
> from list or dict instead.
>


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From fuzzyman at voidspace.org.uk  Wed Jun 23 01:04:15 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 23 Jun 2010 00:04:15 +0100
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTinBsCzx-et_w4pB_W2x-uueKtnhVzo8YYgHMUc9@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>	<4C20FC54.9000608@egenix.com>
	<AANLkTinBsCzx-et_w4pB_W2x-uueKtnhVzo8YYgHMUc9@mail.gmail.com>
Message-ID: <4C21416F.2040009@voidspace.org.uk>

On 22/06/2010 22:40, Robert Collins wrote:
> On Wed, Jun 23, 2010 at 6:09 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>
>    
>>>            return constant.encode('utf-8')
>>>
>>> So now you can write x.split(literal_as('&', x)).
>>>        
>> This polymorphism is what we used in Python2 a lot to write
>> code that works for both Unicode and 8-bit strings.
>>
>> Unfortunately, this no longer works as easily in Python3 due
>> to the literals sometimes having the wrong type and using
>> such a helper function slows things down a lot.
>>      
> It didn't work in 2 either - see for instance the traceback module with
> an Exception with unicode args and a non-ascii file path - the file
> path is in its bytes form, the string joining logic triggers an
> implicit upcast and *boom*.
>
>    
Yeah, there are still a few places in unittest where a unicode exception 
can cause the whole test run to bomb out. No-one has *yet* reported 
these as bugs and I try and ferret them out as I find them.

All the best,

Michael

>> Too bad we can't add such porting enhancements to Python2 anymore
>>      
> Perhaps a 'py3compat' module on pypi, with things like the py._builtin
> reraise helper and so forth ?
>
> -Rob


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog



From raymond.hettinger at gmail.com  Wed Jun 23 01:17:54 2010
From: raymond.hettinger at gmail.com (Raymond Hettinger)
Date: Tue, 22 Jun 2010 16:17:54 -0700
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <4C214040.20304@voidspace.org.uk>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com>	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com>
	<4C21412A.9030709@canterbury.ac.nz>
	<4C214040.20304@voidspace.org.uk>
Message-ID: <C0B4C6C6-959B-4D57-AE50-B11F87FD17C5@gmail.com>


On Jun 22, 2010, at 3:59 PM, Michael Foord wrote:

> On 23/06/2010 00:03, Greg Ewing wrote:
>> Benjamin Peterson wrote:
>> 
>>> IIRC this was because UserDict tries to be a MutableMapping but abcs
>>> require new style classes.
>> 
>> Are there any use cases for UserList and UserDict in new
>> code, now that list and dict can be subclassed?
> 
> Inheriting from list or dict isn't very useful as you have to override *every* method to control behaviour.


Benjamin fixed the UserDict  and ABC problem earlier today in r82155.
It is now the same as it was in Py2.6.
Nothing to see here.
Move along.


Raymond

From fuzzyman at voidspace.org.uk  Wed Jun 23 01:18:29 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Wed, 23 Jun 2010 00:18:29 +0100
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
Message-ID: <4C2144C5.2070902@voidspace.org.uk>

On 22/06/2010 19:07, James Y Knight wrote:
>
> On Jun 22, 2010, at 1:03 PM, Ian Bicking wrote:
>> Similarly I'd expect (from experience) that a programmer using Python 
>> to want to take the same approach, sticking with unencoded data in 
>> nearly all situations.
>
> Yeah. This is a real issue I have with the direction Python3 went: it 
> pushes you into decoding everything to unicode early,

Well, both .NET and Java take this approach as well. I wonder how they 
cope with the particular issues that have been mentioned for web 
applications - both platforms are used extensively for web apps.

Having used IronPython, which has .NET unicode strings (although it does 
a lot of magic to *allow* you to store binary data in strings for 
compatibility with CPython),  I have to say that this approach makes a 
lot of programming *so* much more pleasant.

We did a lot of I/O (can you do useful programming without I/O?) 
including working with databases, but I didn't work *much* with wire 
protocols (fetching a fair bit of data from the web though now I think 
about it). I think wire protocols can present particular problems; 
sometimes having mixed encodings in the same data it seems. Where you 
don't have these problems keeping bytes data and all Unicode text data 
separate and encoding / decoding at the boundaries is really much more 
sane and pleasant.
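
The "decode at the boundary" style looks roughly like this. It is a sketch
only: count_words is a hypothetical function, and UTF-8 as the default wire
encoding is an assumption rather than something the thread specifies.

```python
def count_words(raw, encoding="utf-8"):
    """Take bytes in, work on text internally, hand bytes back out."""
    # Input boundary: bytes become text exactly once.
    text = raw.decode(encoding)

    # Everything in the middle deals only with str.
    words = text.split()
    summary = "%d words, longest: %s" % (len(words), max(words, key=len))

    # Output boundary: text becomes bytes exactly once.
    return summary.encode(encoding)


reply = count_words("na\u00efve caf\u00e9 au lait".encode("utf-8"))
```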

It would be a real shame if we decided that the way forward for Python 3 
was to try and move closer to how bytes/text was handled in Python 2.

All the best,

Michael

> even when you don't care -- all you really wanted to do is pass it 
> from one API to another, with some well-defined transformations, which 
> don't actually depend on it having being decoded properly. (For 
> example, extracting the path from the URL and attempting to open it as 
> a file on the filesystem.)
>
> This means that Python3 programs can become *more* fragile in the face 
> of random data you encounter out in the real world, rather than less 
> fragile, which was the goal of the whole exercise.
>
> The surrogateescape method is a nice workaround for this, but I can't 
> help thinking that it might've been better to just treat stuff as 
> possibly-invalid-but-probably-utf8 byte-strings from input, through 
> processing, to output. It seems kinda too late for that, though: next 
> time someone designs a language, they can try that. :)
>
> James
>
>


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog




From ianb at colorstudy.com  Wed Jun 23 01:23:40 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Tue, 22 Jun 2010 18:23:40 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
Message-ID: <AANLkTim_2Z6K5GqhnQ6rpGmN30WqSjBjLsHgNWyZKo3d@mail.gmail.com>

On Tue, Jun 22, 2010 at 11:17 AM, Guido van Rossum <guido at python.org> wrote:

> (2) Data sources.
>
> These can be functions that produce new data from non-string data,
> e.g. str(<int>), read it from a named file, etc. An example is read()
> vs. write(): it's easy to create a (hypothetical) polymorphic stream
> object that accepts both f.write('booh') and f.write(b'booh'); but you
> need some other hack to make read() return something that matches a
> desired return type. I don't have a generic suggestion for a solution;
> for streams in particular, the existing distinction between binary and
> text streams works, of course, but there are other situations where
> this doesn't generalize (I think some XML interfaces have this
> awkwardness in their API for converting a tree to a string).
>

This reminds me of the optimization ElementTree and lxml made in Python 2
(not sure what they do in Python 3?) where they use str when a string is
ASCII to avoid the memory and performance overhead of unicode.  lxml, at
least, is also dealing with the divide between the internal libxml2
string representation and the Python representation.  This is a place where
bytes+encoding might also have some benefit.  XML is someplace where you
might load a bunch of data but only touch a little bit of it, and the amount
of data is frequently large enough that the efficiencies are important.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From pje at telecommunity.com  Wed Jun 23 01:55:11 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Tue, 22 Jun 2010 19:55:11 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTikYmc0-46H9plWmSGIcgfw_fPM81ZG2EuxMh4r2@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<AANLkTikYmc0-46H9plWmSGIcgfw_fPM81ZG2EuxMh4r2@mail.gmail.com>
Message-ID: <20100622235514.7B3FC3A4099@sparrow.telecommunity.com>

At 07:41 AM 6/23/2010 +1000, Nick Coghlan wrote:
>Then my example above could be made polymorphic (for ASCII compatible
>encodings) by writing:
>
>   [x for x in seq if x.endswith(x.coerce("b"))]
>
>I'm trying to see downsides to this idea, and I'm not really seeing
>any (well, other than 2.7 being almost out the door and the fact we'd
>have to grant ourselves an exception to the language moratorium)

Notice, however, that if multi-string operations used a coercion 
protocol (they currently have to do type checks already for 
byte/unicode mixes), then you could make the entire stdlib 
polymorphic by default, even for other kinds of strings that don't exist yet.

If you invent a new numeric type, generally speaking you can pass it 
to existing stdlib functions taking numbers, as long as it implements 
the appropriate protocols.  Why not do the same for strings?
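
A minimal sketch of the numeric analogy, assuming a made-up type called
Meters (the class and its attribute names are hypothetical, chosen only for
illustration). It shows existing functions accepting a new numeric type
purely because the type implements the protocols those functions call:

```python
import math


class Meters:
    """Hypothetical numeric-like type; not part of any library."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Accept either another Meters or a plain number.
        other_value = other.value if isinstance(other, Meters) else other
        return Meters(self.value + other_value)

    # sum() starts from the int 0, so reflected addition is needed too.
    __radd__ = __add__

    def __float__(self):
        return float(self.value)


# sum() only relies on the addition protocol.
total = sum([Meters(1.5), Meters(2.5)])

# math.sqrt() only relies on __float__.
root = math.sqrt(Meters(16))
```

Neither sum() nor math.sqrt() knows anything about Meters; a string coercion
protocol would presumably give string-taking functions the same property.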


From glyph at twistedmatrix.com  Wed Jun 23 02:23:56 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 22 Jun 2010 20:23:56 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTin2LbL1WOz_81WKC9FB2DfjEI36MpaJqbOfqhEl@mail.gmail.com>
References: <201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<39CFC9B3-E55A-41BB-9718-1457E20ACECC@twistedmatrix.com>
	<89AE7ED6-FB94-45DA-9432-7FCBA25A56BF@gmail.com>
	<AANLkTin2LbL1WOz_81WKC9FB2DfjEI36MpaJqbOfqhEl@mail.gmail.com>
Message-ID: <B8AD3FFF-4BE8-42A4-B8CD-CDEC396DDB5A@twistedmatrix.com>


On Jun 22, 2010, at 12:53 PM, Guido van Rossum wrote:

> On Mon, Jun 21, 2010 at 11:47 PM, Raymond Hettinger
> <raymond.hettinger at gmail.com> wrote:
>> 
>> On Jun 21, 2010, at 10:31 PM, Glyph Lefkowitz wrote:
>> 
>> This is a common pain-point for porting software to 3.x - you had a
>> string, it kinda worked most of the time before, but now you need to keep
>> track of text too and the functions which seemed to work on bytes no longer
>> do.
>> 
>> Thanks Glyph.  That is a nice summary of one kind of challenge facing
>> programmers.
> 
> Ironically, Glyph also described the pain in 2.x: it only "kinda" worked.

It was not my intention to be ironic about it - that was exactly what I meant :).  3.x is forcing you to confront an issue that you _should_ have confronted for 2.x anyway. 

(And, I hope, most libraries doing a 3.x migration will take the opportunity to make their 2.x APIs unicode-clean while still in 2to3 mode, and jump ship to 3.x source only _after_ there's a nice transition path for their clients that can be taken in 2 steps.)


From glyph at twistedmatrix.com  Wed Jun 23 02:25:31 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 22 Jun 2010 20:25:31 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
Message-ID: <94700B9C-25B4-4A75-BA43-20FEA3FDE772@twistedmatrix.com>


On Jun 22, 2010, at 2:07 PM, James Y Knight wrote:

> Yeah. This is a real issue I have with the direction Python3 went: it pushes you into decoding everything to unicode early, even when you don't care -- all you really wanted to do is pass it from one API to another, with some well-defined transformations, which don't actually depend on it having been decoded properly. (For example, extracting the path from the URL and attempting to open it as a file on the filesystem.)

But you _do_ need to decode it in this case.  If you got your URL from some funky UTF-32 datasource, b"\x00\x00\x00/" is not a path separator, "/" is.  Plus, you should really be separating path segments and looking at them individually so that you don't fall victim to "%2F" bugs.  And if you want your code to be portable, you need a Unicode representation of your pathname anyway for Windows; plus, there, you need to care about "\" as well as "/".

The fact that your wire-bytes were probably ASCII(-ish) and your filesystem probably encodes pathnames as UTF-8 and so everything looks like it lines up is no excuse not to be explicit about your expectations there.

You may want to transcode your characters into some other characters later, but that shouldn't stop you from treating them as characters of some variety in the meanwhile.
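
A small Python 3 sketch of the two points above, using only urllib.parse
from the standard library. The URL and the helper name path_segments are
made up for illustration:

```python
from urllib.parse import unquote, urlsplit

# In UTF-32 (big-endian) the separator "/" is four bytes, not one.
utf32_slash = "/".encode("utf-32-be")


def path_segments(url):
    """Split the path first, then unquote each segment on its own."""
    raw_path = urlsplit(url).path
    return [unquote(segment) for segment in raw_path.split("/") if segment]


url = "http://example.com/files/a%2Fb/report.txt"

# Correct order: "%2F" stays inside a single segment.
segments = path_segments(url)

# Wrong order: unquoting the whole path first turns "%2F" into a separator.
naive = [s for s in unquote(urlsplit(url).path).split("/") if s]
```

Here segments has three entries and naive has four, which is the "%2F" bug
in miniature.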

> The surrogateescape method is a nice workaround for this, but I can't help thinking that it might've been better to just treat stuff as possibly-invalid-but-probably-utf8 byte-strings from input, through processing, to output. It seems kinda too late for that, though: next time someone designs a language, they can try that. :)

I can think of lots of optimizations that might be interesting for Python (or perhaps some other runtime less concerned with cleverness overload, like PyPy) to implement, like a UTF-8 combining-characters overlay that would allow for fast indexing, lazily populated as random access dictates.  But this could all be implemented as smartness inside .encode() and .decode() and the str and bytes types without changing the way the API works.

I realize that there are implications at the C level, but as long as you can squeeze a function call in to certain places, it could still work.

I can also appreciate what's been said in this thread a bunch of times: to my knowledge, nobody has actually shown a profile of an application where encoding is significant overhead.  I believe that encoding _will_ be a significant overhead for some applications (and actually I think it will be very significant for some applications that I work on), but optimizations should really be implemented once that's been demonstrated, so that there's a better understanding of what the overhead is, exactly.  Is memory a big deal?  Is CPU?  Is it both?  Do you want to tune for the tradeoff?  etc, etc.  Clever data-structures seem premature until someone has a good idea of all those things.


From glyph at twistedmatrix.com  Wed Jun 23 02:34:31 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 22 Jun 2010 20:34:31 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim_2Z6K5GqhnQ6rpGmN30WqSjBjLsHgNWyZKo3d@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<AANLkTim_2Z6K5GqhnQ6rpGmN30WqSjBjLsHgNWyZKo3d@mail.gmail.com>
Message-ID: <5A4340BB-7B64-4C76-81FF-8A43F179AA7A@twistedmatrix.com>


On Jun 22, 2010, at 7:23 PM, Ian Bicking wrote:

> This is a place where bytes+encoding might also have some benefit.  XML is someplace where you might load a bunch of data but only touch a little bit of it, and the amount of data is frequently large enough that the efficiencies are important.

Different encodings have different characteristics, though, which makes them amenable to different types of optimizations.  If you've got an ASCII string or a latin1 string, the optimizations of unicode are pretty obvious; if you've got one in UTF-16 with no multi-code-unit sequences, you could also hypothetically cheat for a while if you're on a UCS4 build of Python.

I suspect the practical problem here is that there's no CharacterString ABC in the collections module for third-party libraries to provide their own peculiarly-optimized implementations that could lazily turn into real 'str's as needed.  I'd volunteer to write a PEP if I thought I could actually get it done :-\.  If someone else wants to be the primary author though, I'll try to help out.
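
To make the idea concrete, here is a rough sketch of what one such
third-party type might look like. LazyText and everything in it are
hypothetical; no CharacterString ABC exists, and this is only one guess at
the shape of "bytes plus encoding that lazily turns into a real str":

```python
class LazyText:
    """Hypothetical wrapper: keeps the raw bytes and decodes on first use."""

    def __init__(self, raw, encoding):
        self._raw = raw
        self._encoding = encoding
        self._text = None  # filled in lazily

    @property
    def decoded(self):
        """True once the bytes have actually been decoded."""
        return self._text is not None

    def __str__(self):
        if self._text is None:
            self._text = self._raw.decode(self._encoding)
        return self._text

    def __len__(self):
        # Length in characters requires the decoded form.
        return len(str(self))


title = LazyText(b"caf\xc3\xa9", "utf-8")
untouched = title.decoded      # nothing decoded yet
as_text = str(title)           # decoding happens here
```

Data that is loaded but never touched would never pay the decoding cost in
this scheme.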


From murman at gmail.com  Wed Jun 23 02:38:00 2010
From: murman at gmail.com (Michael Urman)
Date: Tue, 22 Jun 2010 19:38:00 -0500
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <hvr6l9$adb$1@dough.gmane.org>
References: <h3sa87mevl05p5ro18062010012216@SMTP>
	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>
	<201006201204.30795.steve@pearwood.info>
	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>
	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<AANLkTilB3yrlhWTPEAwqYRcaTMG4I-LET5efzVyMJfe9@mail.gmail.com>
	<AANLkTimxV86CcyTpqcGBlxx7in4ZUGR3fmwaSWk4jd_S@mail.gmail.com>
	<20100621015824.6A84E3A4099@sparrow.telecommunity.com>
	<AANLkTiluvzgBnAxYI9E5vXGLdQUuOhgqqnESEB_paG-F@mail.gmail.com>
	<20100621145133.7F5333A404D@sparrow.telecommunity.com>
	<AANLkTimtrTmvPRmRsxGeieVEdc-llCa-tB_nRgvgoPoZ@mail.gmail.com>
	<87lja73aau.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTinv0RpzrqtQiL3hN170-zICQQMcDUJJ_2Kb2NKH@mail.gmail.com>
	<hvr6l9$adb$1@dough.gmane.org>
Message-ID: <AANLkTin6byPDnRy21BSiUTrMz2HRew9hZYgYNj3DHMQ7@mail.gmail.com>

On Tue, Jun 22, 2010 at 15:32, Terry Reedy <tjreedy at udel.edu> wrote:
> On 6/22/2010 9:24 AM, Michael Urman wrote:
>> These are trivial functions;
>> I just don't fully understand why the capability isn't baked in.
>
> Possible reasons: They are special purpose functions easily built on the
> basic functions provided. Fine for a 3rd party library. Most people do not
> need them. Some might be misled by them. As others have said, "Not every
> one-liner should be builtin".

Perhaps the two-argument constructions on bytes and str should have
been removed in favor of the .decode and .encode methods on their
respective classes. Or vice versa; I don't have the history to know in
which order they originated, and which is theoretically preferred
these days.
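
For reference, the two spellings under discussion are equivalent in
Python 3. A short check (the sample text is arbitrary):

```python
text = "caf\u00e9"

# Method spelling.
encoded = text.encode("utf-8")
decoded = encoded.decode("utf-8")

# Two-argument constructor spelling.
encoded_ctor = bytes(text, "utf-8")
decoded_ctor = str(encoded, "utf-8")
```

Both pairs produce identical results; the question raised above is only
which spelling should be the preferred one.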

-- 
Michael Urman

From mike.klaas at gmail.com  Wed Jun 23 02:39:04 2010
From: mike.klaas at gmail.com (Mike Klaas)
Date: Tue, 22 Jun 2010 17:39:04 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim_2Z6K5GqhnQ6rpGmN30WqSjBjLsHgNWyZKo3d@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<AANLkTim_2Z6K5GqhnQ6rpGmN30WqSjBjLsHgNWyZKo3d@mail.gmail.com>
Message-ID: <AANLkTimpIoJ4YApQ_N-aHnITIu-5-UOk6RwXWJFalcNk@mail.gmail.com>

On Tue, Jun 22, 2010 at 4:23 PM, Ian Bicking <ianb at colorstudy.com> wrote:

> This reminds me of the optimization ElementTree and lxml made in Python 2
> (not sure what they do in Python 3?) where they use str when a string is
> ASCII to avoid the memory and performance overhead of unicode.

An optimization that forces me to typecheck the return value of the
function and that I only discovered after code started breaking.  I
can't say I was enthused about that decision when I discovered it.

-Mike

From robertc at robertcollins.net  Wed Jun 23 02:57:48 2010
From: robertc at robertcollins.net (Robert Collins)
Date: Wed, 23 Jun 2010 12:57:48 +1200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <94700B9C-25B4-4A75-BA43-20FEA3FDE772@twistedmatrix.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<94700B9C-25B4-4A75-BA43-20FEA3FDE772@twistedmatrix.com>
Message-ID: <AANLkTim3Gsuh_idn4UG8AvvQTMlM00b2M-WxVhxoudnN@mail.gmail.com>

On Wed, Jun 23, 2010 at 12:25 PM, Glyph Lefkowitz
<glyph at twistedmatrix.com> wrote:
> I can also appreciate what's been said in this thread a bunch of times: to my knowledge, nobody has actually shown a profile of an application where encoding is significant overhead.  I believe that encoding _will_ be a significant overhead for some applications (and actually I think it will be very significant for some applications that I work on), but optimizations should really be implemented once that's been demonstrated, so that there's a better understanding of what the overhead is, exactly.  Is memory a big deal?  Is CPU?  Is it both?  Do you want to tune for the tradeoff?  etc, etc.  Clever data-structures seem premature until someone has a good idea of all those things.

bzr has a cache of decoded strings in it precisely because decode is
slow. We accept slowness encoding to the user's locale because that's
typically much less data to examine than we've examined while
generating the commit/diff/whatever. We also face memory pressure on a
regular basis, and that has been, at least partly, due to UCS4 - our
translation cache helps there because we have fewer duplicate UCS4
strings.
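
As an illustration only (this is not bzr's actual code, and the function
name cached_decode is made up), a decode cache of this general kind can be
sketched with functools.lru_cache. Repeated byte strings are decoded once
and the same str object is shared afterwards, which addresses both the CPU
cost and the duplicate-string memory cost:

```python
from functools import lru_cache


@lru_cache(maxsize=4096)
def cached_decode(raw, encoding="utf-8"):
    """Decode raw bytes, reusing the result for byte strings seen before."""
    return raw.decode(encoding)


first = cached_decode(b"src/module.py")
second = cached_decode(b"src/module.py")

# The second call is served from the cache and returns the same object.
stats = cached_decode.cache_info()
```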

You're welcome to dig deeper into this, but I don't have more detail
paged into my head at the moment.

-Rob

From janssen at parc.com  Wed Jun 23 03:56:51 2010
From: janssen at parc.com (Bill Janssen)
Date: Tue, 22 Jun 2010 18:56:51 PDT
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <73196.1277143019@parc.com>
References: <73196.1277143019@parc.com>
Message-ID: <14929.1277258211@parc.com>

Bill Janssen <janssen at parc.com> wrote:

> Considering that we've just released 2.7rc2, there are an awful lot of
> red buildbots for 2.7.  In fact, I don't remember having seen a green
> buildbot for OS X and 2.7.  Shouldn't these be fixed?

Thanks to some action by Ronald, my two PPC OS X buildbots are now
showing green for the trunk.

Bill

From fdrake at acm.org  Wed Jun 23 03:58:07 2010
From: fdrake at acm.org (Fred Drake)
Date: Tue, 22 Jun 2010 21:58:07 -0400
Subject: [Python-Dev] UserDict in 2.7
In-Reply-To: <C0B4C6C6-959B-4D57-AE50-B11F87FD17C5@gmail.com>
References: <58CEF265-1B25-4FD6-9C45-88353A0AF0E7@gmail.com> 
	<AANLkTimNfT7aahv46EDyw85lQv2qnxielcLI205XYO5f@mail.gmail.com> 
	<4C21412A.9030709@canterbury.ac.nz> <4C214040.20304@voidspace.org.uk> 
	<C0B4C6C6-959B-4D57-AE50-B11F87FD17C5@gmail.com>
Message-ID: <AANLkTinEL1Rt53Rea07uyyQBRJ0xCDnmXjAdY8XKTdWA@mail.gmail.com>

On Tue, Jun 22, 2010 at 7:17 PM, Raymond Hettinger
<raymond.hettinger at gmail.com> wrote:
> Benjamin fixed the UserDict and ABC problem earlier today in r82155.
> It is now the same as it was in Py2.6.

Thanks, Benjamin!


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From stephen at xemacs.org  Wed Jun 23 08:44:28 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 23 Jun 2010 15:44:28 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
Message-ID: <871vbyp7sj.fsf@uwakimon.sk.tsukuba.ac.jp>

Ian Bicking writes:

 > Just for perspective, I don't know if I've ever wanted to deal with a URL
 > like that.

Ditto, I do many times a day for Japanese media sites and Wikipedia.

 > I know how it is supposed to work, and I know what a browser does
 > with that, but so many tools will clean that URL up *or* won't be
 > able to deal with it at all that it's not something I'll be passing
 > around.

I'm not suggesting that is something you want to be "passing around";
it's a presentation form, and I prefer that the internal form use
Unicode.

 > While it's nice to be correct about encodings, sometimes it is
 > impractical.  And it is far nicer to avoid the situation entirely.

But you cannot avoid it entirely.  Processing bytes means you are
assuming ASCII compatibility.  Granted, this is a pretty good
assumption, especially if you got the bytes off the wire, but it's not
universally so.

Maybe it's a YAGNI, but one reason I prefer the decode-process-encode
paradigm is that choice of codec is a specification of the assumptions
you're making about encoding.  So the Know-Nothing codec described
above assumes just enough ASCII compatibility to parse the scheme.
You could also have codecs which assume just enough ASCII
compatibility to parse a hierarchical scheme, etc.
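
A sketch of that idea in Python 3, standing in for the "Know-Nothing" codec
with the stock ascii codec plus the surrogateescape error handler. The
function name split_scheme and the sample URI are assumptions made for the
example:

```python
def split_scheme(raw_uri):
    """Parse only the scheme; pass every other byte through untouched."""
    # Assumes just enough ASCII compatibility to find the first ":".
    text = raw_uri.decode("ascii", errors="surrogateescape")
    scheme, _, rest = text.partition(":")
    # Encoding with the same handler restores the original bytes exactly.
    return scheme, rest.encode("ascii", errors="surrogateescape")


# A URI whose tail is EUC-JP, which the ascii codec cannot interpret.
raw = b"http://example.jp/" + "\u65e5\u672c".encode("euc_jp")
scheme, rest = split_scheme(raw)
```

The choice of codec and error handler documents exactly what is being
assumed about the bytes: ASCII up to the colon, nothing at all after it.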

 > That is, decoding content you don't care about isn't just
 > inefficient, it's complicated and can introduce errors.

That depends on the codec(s) used.

 > Similarly I'd expect (from experience) that a programmer using
 > Python to want to take the same approach, sticking with unencoded
 > data in nearly all situations.

Indeed, a programmer using Python 2 would want to do so, because all
her literal strings are bytes by default (ie, if she doesn't mark them
with `u'), and interactive input is, too.  This is no longer so
obvious in Python 3 which takes the attitude that things that are
expected to be human-readable should be processed as str.  The obvious
example in URI space is the file:/// URL, which you'll typically build
up from a user string or a file browser, which will call the os.path
stuff which returns str.

Text editors and viewers will also use str for their buffers, and if
they provide a way to fish out URIs for their users, they'll probably
return str.

I won't pretend to judge the relative importance of such use cases.
But use cases for urllib which naturally favor str until you put the
URI on the wire do exist, as does the debugging presentation aspect.


From ronaldoussoren at mac.com  Wed Jun 23 08:08:13 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 23 Jun 2010 08:08:13 +0200
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTimrLwsklrQzBLbjf0LOCycp_gRa97gqc721NsNs@mail.gmail.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk> <4C1FE4AF.80009@v.loewis.de>
	<AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>
	<EAE2C517-C5EA-4EF7-A4A0-286C3B08381D@mac.com>
	<AANLkTimrLwsklrQzBLbjf0LOCycp_gRa97gqc721NsNs@mail.gmail.com>
Message-ID: <B2467B5D-5751-4557-9488-356D5F95660B@mac.com>


On 22 Jun, 2010, at 19:05, Alexander Belopolsky wrote:

> On Tue, Jun 22, 2010 at 12:39 PM, Ronald Oussoren
> <ronaldoussoren at mac.com> wrote:
> ..
>> Both are valid fixes, both have both advantages and disadvantages.
>> 
>> Your proposal:
>> * Reverts to the behavior in 2.6
>> * Ensures that posix.getgroups and posix.setgroups are internally consistent
>> 
> It is also very simple and since posix module worked fine on OSX for
> years without _DARWIN_C_SOURCE, I think this is a very low risk
> change.

I don't agree.  The patch itself is pretty simple, but it does make a rather significant change to the build process: the compile-time environment in configure would be different from the one used during the compilation of posixmodule. That is, the configure checks for features (the HAVE_FOOBAR macros in pyconfig.h) would use _DARWIN_C_SOURCE while posixmodule itself wouldn't.  This may lead to subtle bugs, or even compile errors (because some function definitions change when _DARWIN_C_SOURCE is active).

And man compat(5) says:

<quote>
32-BIT COMPILATION
     Defining _NONSTD_SOURCE causes library and kernel calls to behave as closely to Mac OS X 10.3's library and kernel calls as possible.  Any
     behavioral changes in this mode are documented in the LEGACY sections of the individual function calls.

     Defining _POSIX_C_SOURCE or _DARWIN_C_SOURCE causes library and kernel calls to conform to the SUSv3 standards even if doing so would alter
     the behavior of functions used in 10.3.  Defining _POSIX_C_SOURCE also removes functions, types, and other interfaces that are not part of
     SUSv3 from the normal C namespace, unless _DARWIN_C_SOURCE is also defined (i.e., _DARWIN_C_SOURCE is _POSIX_C_SOURCE with non-POSIX exten-
     sions).  In any of these cases, the _DARWIN_FEATURE_UNIX_CONFORMANCE feature macro will be defined to the SUS conformance level (it is unde-
     fined otherwise).

     Starting in Mac OS X 10.5, if none of the macros _NONSTD_SOURCE, _POSIX_C_SOURCE or _DARWIN_C_SOURCE are defined, and the environment vari-
     able MACOSX_DEPLOYMENT_TARGET is either undefined or set to 10.5 or greater (or equivalently, the gcc(1) option -mmacosx-version-min is
     either not specified or set to 10.5 or greater), then UNIX conformance will be on by default, and non-POSIX extensions will also be available
     (this is the equivalent of defining _DARWIN_C_SOURCE).  For version values less that 10.5, UNIX conformance will be off (the equivalent of
     defining _NONSTD_SOURCE).
</quote>

My interpretation of that is that _DARWIN_C_SOURCE should be used to get SUSv3 APIs while keeping access to Darwin-specific APIs as well. When you deploy to 10.5 or later the compiler will set _DARWIN_C_SOURCE for you unless you set one of the other feature-selecting defines.

> 
>> My proposal:
>> * Uses the newer ABI, which is more likely to be the one Apple wants you to use
> 
> I don't think so.  In getgroups(2) I see
> 
> LEGACY DESCRIPTION
>     If _DARWIN_C_SOURCE is defined, getgroups() can return more than
> {NGROUPS_MAX} groups.
> 
> This suggests that this is legacy behavior.  Newer applications should
> use getgrouplist instead.

I honestly don't know why this is in the LEGACY DESCRIPTION.  But as the functionality you get with _DARWIN_C_SOURCE was added later I'd say that the behavior is intentional and not legacy.  By not defining _DARWIN_C_SOURCE we don't necessarily get full UNIX behavior for other APIs.

> 
>> * Is compatible with system tools (that is, posix.getgroups() agrees with id(1))
> 
> I have not tested this recently, but I think if you exec id from a
> program after a call to setgroups(), it will return process groups,
> not user groups.
> 
>> * Is compatible with /usr/bin/python
> 
> I am sure that once this issue is fixed upstream, Apple will pick it up
> with the next version.

Haha.  Apple explicitly added patches to get the current behavior instead of the default; what makes you think that they'll revert to the older behavior?

> 
>> * results in posix.getgroups not reflecting results of posix.setgroups
>> 
> 
> This effectively substitutes getgrouplist called on the current user
> for getgroups.  In 3.x, I believe the correct action will be to
> provide direct access to getgrouplist which is while not POSIX (yet?),
> is widely available.

I don't mind adding getgrouplist, but that issue is separate from this one. BTW, apparently getgrouplist is posix (<http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/libc.html>), although this isn't a requirement for being added to the posix module.


It is still my opinion that the second option is preferable for better compatibility with system tools, even if the patch is more complicated and the library function we use can be considered to be broken.
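
For anyone who wants to check the "agrees with id(1)" point on their own
machine, here is a rough sketch. The helper names are made up, and the text
passed in is assumed to be the output of running "id -G" by hand; the
result depends entirely on the platform and on how Python was built:

```python
import os


def parse_id_output(text):
    """Turn the whitespace-separated output of `id -G` into a set of gids."""
    return {int(field) for field in text.split()}


def missing_from_process(id_output):
    """Gids reported by id(1) that os.getgroups() does not report."""
    return sorted(parse_id_output(id_output) - set(os.getgroups()))


# Sample text only; paste real `id -G` output to compare on a given system.
sample = parse_id_output("20 12 61\n")
```

An empty list from missing_from_process() means os.getgroups() reports at
least every group that id(1) listed.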

Ronald


From stephen at xemacs.org  Wed Jun 23 09:07:50 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 23 Jun 2010 16:07:50 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
Message-ID: <87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>

James Y Knight writes:

 > The surrogateescape method is a nice workaround for this, but I can't  
 > help thinking that it might've been better to just treat stuff as  
 > possibly-invalid-but-probably-utf8 byte-strings from input, through  
 > processing, to output.

This is the world we already have, modulo s/utf8/ascii + random GR
charset/.  It doesn't work, and it can't, in Japan or China or Korea,
and probably not in Russia or Kazakhstan, for some time yet.

That's not to say that byte-oriented processing doesn't have its
place.  And in many cases it's reasonable (but not secure or
bulletproof!) to assume ASCII compatibility of the byte stream,
passing through syntactically unimportant bytes verbatim.  Syntactic
analysis of such streams will surely have a lot in common with that
for text streams, so the same tools should be available.  (That's the
point of Guido's endorsement of polymorphism, AIUI.)

But it's just not reasonable to assume that will work in a context
where text streams from various sources are mixed with byte streams.
In that case, the byte streams need to be converted to text before
mixing.  (You can't do it the other way around because there is no
guarantee that the text is compatible with the current encoding of the
byte stream, nor that all the byte streams have the same encoding.)
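
A short illustration of why the conversion has to go from bytes to text.
The two "streams" below carry the same Japanese word in two different
legacy encodings (the variable names are arbitrary):

```python
word = "\u65e5\u672c\u8a9e"

# The same text arrives from two sources in two different encodings.
stream_a = word.encode("shift_jis")
stream_b = word.encode("euc_jp")

# Decoding each stream with its own codec and then mixing is safe.
merged_text = stream_a.decode("shift_jis") + stream_b.decode("euc_jp")

# Mixing the raw bytes first leaves no single codec that reads both halves.
mixed_bytes = stream_a + stream_b
misread = mixed_bytes.decode("shift_jis", errors="replace")
```

merged_text is the word twice, while misread is not, because the second
half of mixed_bytes is interpreted with the wrong codec.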

We do need str-based implementations of modules like urllib.

From mal at egenix.com  Wed Jun 23 11:18:23 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 23 Jun 2010 11:18:23 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTin2CNb-zb3BpjSNWVIQDLCVwUh6-FNRQd4XKAP7@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>	<4C20FC54.9000608@egenix.com>
	<AANLkTin2CNb-zb3BpjSNWVIQDLCVwUh6-FNRQd4XKAP7@mail.gmail.com>
Message-ID: <4C21D15F.8070304@egenix.com>

Nick Coghlan wrote:
> On Wed, Jun 23, 2010 at 4:09 AM, M.-A. Lemburg <mal at egenix.com> wrote:
>> It would be great if we could have something like the above as
>> builtin method:
>>
>> x.split('&'.as(x))
> 
> As per my other message, another possible (and reasonably intuitive)
> spelling would be:
> 
>   x.split(x.coerce('&'))

You are right: there are two ways to adapt one object to another.
You can either adapt object 1 to object 2 or object 2 to object 1.
This is what the Python2 coercion protocol does for operators.
I just wanted to avoid using that term, since Python3 removes
the coercion protocol.

> Writing it as a helper function is also possible, although it would be
> trickier to remember the correct argument ordering:
> 
>   def coerce_to(target, obj, encoding=None, errors='surrogateescape'):
>     if isinstance(obj, type(target)):
>         return obj
>     if encoding is None:
>         encoding = sys.getdefaultencoding()
>     try:
>         convert = obj.decode
>     except AttributeError:
>         convert = obj.encode
>     return convert(encoding, errors)
> 
>   x.split(coerce_to(x, '&'))
> 
>> Perhaps something to discuss on the language summit at EuroPython.
>>
>> Too bad we can't add such porting enhancements to Python2 anymore.
> 
> Well, we can if we really want to, it just entails convincing Benjamin
> to reschedule the 2.7 final release. Given the UserDict/ABC/old-style
> classes issue, there's a fair chance there's going to be at least one
> more 2.7 RC anyway.
> 
> That said, since this kind of coercion can be done in a helper
> function, that should be adequate for the 2.x to 3.x conversion case
> (for 2.x, the helper function can be defined to just return the second
> argument since bytes and str are the same type, while the 3.x version
> would look something like the code above)

True.

Note that the point of using a builtin method was to get
better performance. Such type adaptions are often needed in
loops, so adding a few extra Python function calls just to
convert a str object to a bytes object or vice-versa is a
bit much overhead.
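
For reference, a self-contained Python 3 version of the helper quoted
above, with the missing import added. The wrapper split_fields is a made-up
name used only to show the polymorphic call:

```python
import sys


def coerce_to(target, obj, encoding=None, errors="surrogateescape"):
    """Return obj converted to the string type of target (str or bytes)."""
    if isinstance(obj, type(target)):
        return obj
    if encoding is None:
        encoding = sys.getdefaultencoding()
    try:
        convert = obj.decode      # obj is bytes, target is str
    except AttributeError:
        convert = obj.encode      # obj is str, target is bytes
    return convert(encoding, errors)


def split_fields(query):
    """Split on "&" whether query is str or bytes."""
    return query.split(coerce_to(query, "&"))


text_fields = split_fields("a=1&b=2")
byte_fields = split_fields(b"a=1&b=2")
```

Each call to split_fields() pays for one extra Python-level function call,
which is the overhead a builtin method would avoid.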

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 23 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                25 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From cesare.di.mauro at gmail.com  Wed Jun 23 12:12:36 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Wed, 23 Jun 2010 12:12:36 +0200
Subject: [Python-Dev] WPython 1.1 was released
Message-ID: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>

I've released WPython 1.1, which brings many optimizations and refactorings.

The project is hosted at Google Code: http://code.google.com/p/wpython2/ and
available as a Mercurial repository
http://code.google.com/p/wpython2/source/checkout?repo=wpython11 .

In the download section
http://code.google.com/p/wpython2/downloads/list there are the slides
from the last Italian PyCon, where I presented the
project and illustrated the changes.

You can also download the binaries for Windows (compressed in 7-Zip format:
http://www.7-zip.org/ ) and the sources (Unix users need to chmod +x
Parser/Python.asdl and the configure files).

Attached are some benchmarks run with the Unladen Swallow test suite
(against Python 2.6.4).

Regards
Cesare
-------------- next part --------------
Report on Darwin iMac-di-Mirco.local 10.3.0 Darwin Kernel Version 10.3.0: Fri Feb 26 11:57:13 PST 2010; root:xnu-1504.3.12~1/RELEASE_X86_64 x86_64 i386
Total CPU cores: 2

### 2to3 ###
29.085133 -> 25.601404: 1.1361x faster

### bzr_startup ###
Min: 0.204419 -> 0.096856: 2.1105x faster
Avg: 0.213686 -> 0.113666: 1.8799x faster
Significant (t=71.767819)
Stddev: 0.01277 -> 0.00559: 2.2833x smaller
Timeline: http://tinyurl.com/y7qgndp
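
How to read the "Nx faster" / "Nx slower" figures: each line gives the
baseline timing (Python 2.6.4), then the WPython timing, then their
ratio. The sketch below reproduces that arithmetic for the 2to3 and
bzr_startup Min lines above. The function name is hypothetical, and
treating equal timings as "faster" is an assumption of this sketch, not
something the report states.

```python
def describe_change(base, changed):
    """Format a 'base -> changed' timing pair the way the report lines read.

    Hypothetical helper; equal timings are reported as 'faster' here,
    which is an assumption of this sketch.
    """
    if changed <= base:
        # Changed interpreter took less time: ratio is base / changed.
        return "%.4fx faster" % (base / changed)
    # Changed interpreter took more time: ratio is changed / base.
    return "%.4fx slower" % (changed / base)


print(describe_change(29.085133, 25.601404))  # 1.1361x faster
print(describe_change(0.204419, 0.096856))    # 2.1105x faster
```

The Stddev ratios appear to be computed from unrounded values, so
recomputing them from the printed figures can differ in the last digits.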

### call_method ###
Min: 0.644754 -> 0.622001: 1.0366x faster
Avg: 0.806862 -> 0.725472: 1.1122x faster
Significant (t=11.301638)
Stddev: 0.07300 -> 0.04951: 1.4744x smaller
Timeline: http://tinyurl.com/y3rfsnu

### call_method_slots ###
Min: 0.626559 -> 0.589525: 1.0628x faster
Avg: 0.761122 -> 0.680558: 1.1184x faster
Significant (t=11.706336)
Stddev: 0.06496 -> 0.05371: 1.2093x smaller
Timeline: http://tinyurl.com/y7kkg9m

### call_method_unknown ###
Min: 0.669814 -> 0.593711: 1.1282x faster
Avg: 0.883463 -> 0.746100: 1.1841x faster
Significant (t=8.601215)
Stddev: 0.13619 -> 0.14039: 1.0308x larger
Timeline: http://tinyurl.com/y6u5qut

### call_simple ###
Min: 0.486911 -> 0.435191: 1.1188x faster
Avg: 0.700634 -> 0.590928: 1.1857x faster
Significant (t=9.030587)
Stddev: 0.12218 -> 0.08491: 1.4390x smaller
Timeline: http://tinyurl.com/y2pnbfz

### float ###
Min: 0.126226 -> 0.097072: 1.3003x faster
Avg: 0.174486 -> 0.164656: 1.0597x faster
Significant (t=2.822244)
Stddev: 0.02922 -> 0.04668: 1.5976x larger
Timeline: http://tinyurl.com/y3o7gko

### hg_startup ###
Min: 0.057444 -> 0.042930: 1.3381x faster
Avg: 0.067769 -> 0.050515: 1.3416x faster
Significant (t=109.019677)
Stddev: 0.00293 -> 0.00199: 1.4687x smaller
Timeline: http://tinyurl.com/y5ss3l9

### html5lib ###
Min: 16.410586 -> 15.971322: 1.0275x faster
Avg: 16.579096 -> 16.119135: 1.0285x faster
Significant (t=5.554462)
Stddev: 0.13844 -> 0.12297: 1.1258x smaller
Timeline: http://tinyurl.com/yya44oj

### html5lib_warmup ###
Min: 17.765242 -> 15.582871: 1.1400x faster
Avg: 17.968972 -> 16.065290: 1.1185x faster
Significant (t=10.236030)
Stddev: 0.28980 -> 0.29826: 1.0292x larger
Timeline: http://tinyurl.com/y7osmkp

### iterative_count ###
Min: 0.156827 -> 0.084917: 1.8468x faster
Avg: 0.166389 -> 0.090218: 1.8443x faster
Significant (t=26.855602)
Stddev: 0.01766 -> 0.00950: 1.8586x smaller
Timeline: http://tinyurl.com/y2kz25f

### nbody ###
Min: 0.498760 -> 0.427710: 1.1661x faster
Avg: 0.515754 -> 0.445318: 1.1582x faster
Significant (t=22.964790)
Stddev: 0.01500 -> 0.01566: 1.0442x larger
Timeline: http://tinyurl.com/y7b92bm

### normal_startup ###
Min: 0.534059 -> 0.817747: 1.5312x slower
Avg: 0.547493 -> 0.838141: 1.5309x slower
Significant (t=-127.297104)
Stddev: 0.00799 -> 0.01403: 1.7567x larger
Timeline: http://tinyurl.com/y5tfkm3

### nqueens ###
Min: 0.583106 -> 0.573619: 1.0165x faster
Avg: 0.611182 -> 0.595222: 1.0268x faster
Significant (t=3.893252)
Stddev: 0.02367 -> 0.01674: 1.4142x smaller
Timeline: http://tinyurl.com/y79zhpz

### pickle ###
Min: 1.660705 -> 1.576223: 1.0536x faster
Avg: 1.757750 -> 1.672262: 1.0511x faster
Significant (t=9.284162)
Stddev: 0.04774 -> 0.04427: 1.0785x smaller
Timeline: http://tinyurl.com/y2f3eee

### pickle_dict ###
Min: 1.389026 -> 1.468648: 1.0573x slower
Avg: 1.479180 -> 1.551554: 1.0489x slower
Significant (t=-7.056610)
Stddev: 0.05664 -> 0.04529: 1.2507x smaller
Timeline: http://tinyurl.com/y2kl4no

### pickle_list ###
Min: 0.802236 -> 0.780976: 1.0272x faster
Avg: 0.843450 -> 0.822717: 1.0252x faster
Significant (t=3.353898)
Stddev: 0.02861 -> 0.03305: 1.1554x larger
Timeline: http://tinyurl.com/y2csxb9

### pybench ###
Min: 4906 -> 4344: 1.1294x faster
Avg: 5235 -> 4618: 1.1336x faster

### regex_compile ###
Min: 0.757385 -> 0.663902: 1.1408x faster
Avg: 0.807480 -> 0.698190: 1.1565x faster
Significant (t=20.304562)
Stddev: 0.03027 -> 0.02308: 1.3116x smaller
Timeline: http://tinyurl.com/y5vmu5y

### regex_effbot ###
Min: 0.102901 -> 0.095138: 1.0816x faster
Avg: 0.109344 -> 0.102460: 1.0672x faster
Significant (t=5.515715)
Stddev: 0.00574 -> 0.00670: 1.1678x larger
Timeline: http://tinyurl.com/yyhbuzh

### regex_v8 ###
Min: 0.123948 -> 0.106031: 1.1690x faster
Avg: 0.128534 -> 0.111830: 1.1494x faster
Significant (t=16.677634)
Stddev: 0.00436 -> 0.00558: 1.2787x larger
Timeline: http://tinyurl.com/y2zrssn

### richards ###
Min: 0.354665 -> 0.287113: 1.2353x faster
Avg: 0.381205 -> 0.306374: 1.2442x faster
Significant (t=23.417400)
Stddev: 0.01926 -> 0.01182: 1.6294x smaller
Timeline: http://tinyurl.com/yyzqb7v

### slowpickle ###
Min: 0.753230 -> 0.664495: 1.1335x faster
Avg: 0.801162 -> 0.708291: 1.1311x faster
Significant (t=17.994391)
Stddev: 0.02267 -> 0.02860: 1.2612x larger
Timeline: http://tinyurl.com/y4z6poh

### slowspitfire ###
Min: 0.868708 -> 0.872393: 1.0042x slower
Avg: 0.971014 -> 0.919428: 1.0561x faster
Significant (t=4.503573)
Stddev: 0.07780 -> 0.02253: 3.4529x smaller
Timeline: http://tinyurl.com/y64sn8p

### slowunpickle ###
Min: 0.337317 -> 0.299357: 1.1268x faster
Avg: 0.353311 -> 0.313929: 1.1254x faster
Significant (t=19.034627)
Stddev: 0.01066 -> 0.01002: 1.0629x smaller
Timeline: http://tinyurl.com/y3symau

### startup_nosite ###
Min: 0.317232 -> 0.224719: 1.4117x faster
Avg: 0.333151 -> 0.235118: 1.4170x faster
Significant (t=95.671333)
Stddev: 0.00851 -> 0.00571: 1.4919x smaller
Timeline: http://tinyurl.com/yyvr8m5

### threaded_count ###
Min: 0.194147 -> 0.116080: 1.6725x faster
Avg: 0.216559 -> 0.139140: 1.5564x faster
Significant (t=50.972602)
Stddev: 0.00765 -> 0.00753: 1.0162x smaller
Timeline: http://tinyurl.com/y38bz5h

### unpack_sequence ###
Min: 0.000093 -> 0.000082: 1.1337x faster
Avg: 0.000098 -> 0.000086: 1.1343x faster
Significant (t=25.434812)
Stddev: 0.00007 -> 0.00008: 1.1129x larger
Timeline: http://tinyurl.com/y5hv9ck

### unpickle ###
Min: 1.102754 -> 1.015811: 1.0856x faster
Avg: 1.138448 -> 1.052802: 1.0814x faster
Significant (t=18.018135)
Stddev: 0.02248 -> 0.02499: 1.1118x larger
Timeline: http://tinyurl.com/y49x4pk

### unpickle_list ###
Min: 0.990238 -> 0.881112: 1.1239x faster
Avg: 1.043900 -> 0.933968: 1.1177x faster
Significant (t=21.205782)
Stddev: 0.02977 -> 0.02139: 1.3913x smaller
Timeline: http://tinyurl.com/y49pm9p


Report on Linux cionci-desktop 2.6.27-17-generic #1 SMP Fri Mar 12 02:08:25 UTC 2010 x86_64 
Total CPU cores: 2

### 2to3 ###
27.729733 -> 25.521595: 1.0865x faster

### bzr_startup ###
Min: 0.072004 -> 0.068004: 1.0588x faster
Avg: 0.094326 -> 0.091926: 1.0261x faster
Not significant
Stddev: 0.00883 -> 0.00958: 1.0851x larger
Timeline: http://tinyurl.com/y5zc5ca

### call_method ###
Min: 0.630349 -> 0.566228: 1.1132x faster
Avg: 0.655913 -> 0.574280: 1.1421x faster
Significant (t=54.712328)
Stddev: 0.01462 -> 0.01096: 1.3344x smaller
Timeline: http://tinyurl.com/y6eg77c

### call_method_slots ###
Min: 0.635804 -> 0.511669: 1.2426x faster
Avg: 0.660014 -> 0.528936: 1.2478x faster
Significant (t=69.342882)
Stddev: 0.01859 -> 0.01380: 1.3470x smaller
Timeline: http://tinyurl.com/y7p9esb

### call_method_unknown ###
Min: 0.766309 -> 0.562713: 1.3618x faster
Avg: 0.774030 -> 0.585773: 1.3214x faster
Significant (t=90.713925)
Stddev: 0.00759 -> 0.02426: 3.1937x larger
Timeline: http://tinyurl.com/y6y6w7a

### call_simple ###
Min: 0.498106 -> 0.451661: 1.1028x faster
Avg: 0.502283 -> 0.460072: 1.0917x faster
Significant (t=62.530336)
Stddev: 0.00738 -> 0.00373: 1.9763x smaller
Timeline: http://tinyurl.com/y5gt8qa

### float ###
Min: 0.117934 -> 0.102821: 1.1470x faster
Avg: 0.129057 -> 0.117482: 1.0985x faster
Significant (t=12.577691)
Stddev: 0.00811 -> 0.01208: 1.4897x larger
Timeline: http://tinyurl.com/y2pc4wj

### hg_startup ###
Min: 0.012000 -> 0.012001: 1.0001x slower
Avg: 0.033594 -> 0.032258: 1.0414x faster
Significant (t=3.596547)
Stddev: 0.00597 -> 0.00578: 1.0320x smaller
Timeline: http://tinyurl.com/y449a8r

### html5lib ###
Min: 16.581036 -> 15.668980: 1.0582x faster
Avg: 16.823451 -> 15.946597: 1.0550x faster
Significant (t=4.738181)
Stddev: 0.22787 -> 0.34542: 1.5159x larger
Timeline: http://tinyurl.com/y3wx52k

### html5lib_warmup ###
Min: 16.436294 -> 15.664941: 1.0492x faster
Avg: 16.810495 -> 15.983748: 1.0517x faster
Significant (t=2.827967)
Stddev: 0.43953 -> 0.48388: 1.1009x larger
Timeline: http://tinyurl.com/y74vue8

### iterative_count ###
Min: 0.189088 -> 0.083317: 2.2695x faster
Avg: 0.191612 -> 0.088073: 2.1756x faster
Significant (t=65.385891)
Stddev: 0.00501 -> 0.01001: 1.9975x larger
Timeline: http://tinyurl.com/y65yy5c

### nbody ###
Min: 0.568523 -> 0.426052: 1.3344x faster
Avg: 0.580190 -> 0.428620: 1.3536x faster
Significant (t=72.626477)
Stddev: 0.01450 -> 0.00273: 5.3178x smaller
Timeline: http://tinyurl.com/y5hbwsy

### normal_startup ###
Min: 0.420100 -> 0.408876: 1.0275x faster
Avg: 0.475876 -> 0.489076: 1.0277x slower
Not significant
Stddev: 0.04082 -> 0.05543: 1.3579x larger
Timeline: http://tinyurl.com/y5jdfgq

### nqueens ###
Min: 0.585605 -> 0.577289: 1.0144x faster
Avg: 0.603038 -> 0.594904: 1.0137x faster
Significant (t=2.026307)
Stddev: 0.01851 -> 0.02152: 1.1629x larger
Timeline: http://tinyurl.com/yydzdhw

### pickle ###
Min: 1.592286 -> 1.584492: 1.0049x faster
Avg: 1.611001 -> 1.606726: 1.0027x faster
Not significant
Stddev: 0.01343 -> 0.03570: 2.6586x larger
Timeline: http://tinyurl.com/yyax7wc

### pickle_dict ###
Min: 1.316577 -> 1.298239: 1.0141x faster
Avg: 1.320249 -> 1.311228: 1.0069x faster
Significant (t=3.270732)
Stddev: 0.00367 -> 0.01915: 5.2196x larger
Timeline: http://tinyurl.com/y2smb8n

### pickle_list ###
Min: 0.734164 -> 0.727414: 1.0093x faster
Avg: 0.749225 -> 0.738023: 1.0152x faster
Significant (t=3.523434)
Stddev: 0.01996 -> 0.01035: 1.9291x smaller
Timeline: http://tinyurl.com/yybbuct

### pybench ###
Min: 5133 -> 4264: 1.2038x faster
Avg: 5370 -> 4448: 1.2073x faster

### regex_compile ###
Min: 0.783521 -> 0.706420: 1.1091x faster
Avg: 0.805385 -> 0.743189: 1.0837x faster
Significant (t=14.697890)
Stddev: 0.01900 -> 0.02312: 1.2168x larger
Timeline: http://tinyurl.com/y4ng9oz

### regex_effbot ###
Min: 0.106946 -> 0.108064: 1.0105x slower
Avg: 0.108937 -> 0.112714: 1.0347x slower
Significant (t=-4.189386)
Stddev: 0.00158 -> 0.00618: 3.9173x larger
Timeline: http://tinyurl.com/y2xs6yp

### regex_v8 ###
Min: 0.114305 -> 0.110961: 1.0301x faster
Avg: 0.119100 -> 0.113885: 1.0458x faster
Significant (t=6.210478)
Stddev: 0.00525 -> 0.00278: 1.8876x smaller
Timeline: http://tinyurl.com/y5q2nlh

### richards ###
Min: 0.376030 -> 0.309641: 1.2144x faster
Avg: 0.389031 -> 0.314998: 1.2350x faster
Significant (t=29.499544)
Stddev: 0.01745 -> 0.00325: 5.3669x smaller
Timeline: http://tinyurl.com/y5rh4av

### slowpickle ###
Min: 0.800369 -> 0.711095: 1.1255x faster
Avg: 0.824734 -> 0.735770: 1.1209x faster
Significant (t=19.434640)
Stddev: 0.02554 -> 0.01989: 1.2842x smaller
Timeline: http://tinyurl.com/y79lh35

### slowspitfire ###
Min: 0.813913 -> 0.761560: 1.0687x faster
Avg: 0.829754 -> 0.841118: 1.0137x slower
Not significant
Stddev: 0.01202 -> 0.05522: 4.5958x larger
Timeline: http://tinyurl.com/y4y6f4x

### slowunpickle ###
Min: 0.369238 -> 0.296829: 1.2439x faster
Avg: 0.384044 -> 0.300151: 1.2795x faster
Significant (t=32.788791)
Stddev: 0.01766 -> 0.00391: 4.5186x smaller
Timeline: http://tinyurl.com/y84c2bp

### startup_nosite ###
Min: 0.173227 -> 0.183291: 1.0581x slower
Avg: 0.234029 -> 0.235226: 1.0051x slower
Not significant
Stddev: 0.02222 -> 0.01951: 1.1389x smaller
Timeline: http://tinyurl.com/y2esfmd

### threaded_count ###
Min: 0.203453 -> 0.084667: 2.4030x faster
Avg: 0.263979 -> 0.105661: 2.4984x faster
Significant (t=26.001645)
Stddev: 0.03833 -> 0.01960: 1.9552x smaller
Timeline: http://tinyurl.com/y74qvbf

### unpack_sequence ###
Min: 0.000116 -> 0.000108: 1.0728x faster
Avg: 0.000121 -> 0.000118: 1.0261x faster
Significant (t=13.346440)
Stddev: 0.00004 -> 0.00004: 1.0544x larger
Timeline: http://tinyurl.com/y6rld7k

### unpickle ###
Min: 0.919231 -> 0.922668: 1.0037x slower
Avg: 0.936096 -> 0.947798: 1.0125x slower
Significant (t=-3.379601)
Stddev: 0.01505 -> 0.01931: 1.2834x larger
Timeline: http://tinyurl.com/y3ymn85

### unpickle_list ###
Min: 0.690399 -> 0.690025: 1.0005x faster
Avg: 0.729519 -> 0.698789: 1.0440x faster
Significant (t=11.660568)
Stddev: 0.01430 -> 0.01195: 1.1965x smaller
Timeline: http://tinyurl.com/y38lfuh


Report on Linux sauron 2.6.33-ARCH #1 SMP PREEMPT Sun Apr 4 10:27:30 CEST 2010 x86_64 AMD Athlon(tm) 64 X2 Dual Core Processor 5200+
Total CPU cores: 2

### 2to3 ###
29.598071 -> 23.691789: 1.2493x faster

### bzr_startup ###
Min: 0.083328 -> 0.076661: 1.0870x faster
Avg: 0.100727 -> 0.094061: 1.0709x faster
Significant (t=5.464159)
Stddev: 0.00863 -> 0.00863: 1.0000x larger
Timeline: http://tinyurl.com/y6mng7k

### call_method ###
Min: 0.796609 -> 0.538237: 1.4800x faster
Avg: 0.816184 -> 0.547101: 1.4918x faster
Significant (t=92.212665)
Stddev: 0.03177 -> 0.01636: 1.9417x smaller
Timeline: http://tinyurl.com/yygle37

### call_method_slots ###
Min: 0.780177 -> 0.535730: 1.4563x faster
Avg: 0.797951 -> 0.544117: 1.4665x faster
Significant (t=104.627536)
Stddev: 0.02414 -> 0.01733: 1.3926x smaller
Timeline: http://tinyurl.com/y76hawm

### call_method_unknown ###
Min: 0.808852 -> 0.610603: 1.3247x faster
Avg: 0.821008 -> 0.614395: 1.3363x faster
Significant (t=109.946891)
Stddev: 0.02158 -> 0.00800: 2.6994x smaller
Timeline: http://tinyurl.com/y43e5fl

### call_simple ###
Min: 0.602984 -> 0.484837: 1.2437x faster
Avg: 0.627628 -> 0.508925: 1.2332x faster
Significant (t=56.792486)
Stddev: 0.02009 -> 0.01587: 1.2658x smaller
Timeline: http://tinyurl.com/yyrerh8

### float ###
Min: 0.145489 -> 0.120753: 1.2048x faster
Avg: 0.157275 -> 0.131557: 1.1955x faster
Significant (t=29.200486)
Stddev: 0.01020 -> 0.00948: 1.0763x smaller
Timeline: http://tinyurl.com/y5h4frq

### hg_startup ###
Min: 0.013332 -> 0.016666: 1.2501x slower
Avg: 0.030811 -> 0.033631: 1.0915x slower
Significant (t=-7.625262)
Stddev: 0.00610 -> 0.00558: 1.0933x smaller
Timeline: http://tinyurl.com/y7c2vbv

### html5lib ###
Min: 16.772239 -> 13.632444: 1.2303x faster
Avg: 17.400199 -> 13.809100: 1.2601x faster
Significant (t=19.710438)
Stddev: 0.35648 -> 0.19722: 1.8075x smaller
Timeline: http://tinyurl.com/y52q84h

### html5lib_warmup ###
Min: 17.155307 -> 13.597860: 1.2616x faster
Avg: 17.758442 -> 14.069391: 1.2622x faster
Significant (t=12.638530)
Stddev: 0.58006 -> 0.29922: 1.9386x smaller
Timeline: http://tinyurl.com/y5ragx4

### iterative_count ###
Min: 0.272019 -> 0.144380: 1.8841x faster
Avg: 0.321844 -> 0.155405: 2.0710x faster
Significant (t=23.655896)
Stddev: 0.04319 -> 0.02469: 1.7493x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.46044492722&chco=FF0000,0000FF&chdl=/usr/bin/python|../wpython11/python&chds=0,1.46044492722&chd=t:0.28,0.28,0.28,0.28,0.33,0.33,0.31,0.31,0.29,0.3,0.32,0.35,0.29,0.3,0.29,0.28,0.27,0.27,0.27,0.29,0.32,0.35,0.31,0.28,0.27,0.3,0.35,0.3,0.29,0.28,0.3,0.29,0.31,0.31,0.33,0.32,0.34,0.41,0.34,0.33,0.33,0.34,0.34,0.36,0.4,0.43,0.46,0.41,0.38,0.35|0.3,0.15,0.15,0.15,0.17,0.15,0.15,0.15,0.16,0.15,0.14,0.14,0.2,0.16,0.17,0.19,0.16,0.16,0.16,0.2,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.15,0.16,0.16,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.14,0.15,0.14,0.14,0.15,0.14,0.17,0.15&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=iterative_count

### nbody ###
Min: 0.639303 -> 0.496505: 1.2876x faster
Avg: 0.663221 -> 0.507123: 1.3078x faster
Significant (t=42.102614)
Stddev: 0.01815 -> 0.01892: 1.0424x larger
Timeline: http://tinyurl.com/y64lglq

### normal_startup ###
Min: 0.374472 -> 0.461435: 1.2322x slower
Avg: 0.413358 -> 0.515210: 1.2464x slower
Significant (t=-17.591195)
Stddev: 0.02972 -> 0.02815: 1.0558x smaller
Timeline: http://tinyurl.com/y7qj6zz

### nqueens ###
Min: 0.698012 -> 0.507417: 1.3756x faster
Avg: 0.748165 -> 0.559723: 1.3367x faster
Significant (t=21.603119)
Stddev: 0.03138 -> 0.05310: 1.6921x larger
Timeline: http://tinyurl.com/y3xv95e

### pickle ###
Min: 1.584518 -> 1.526627: 1.0379x faster
Avg: 1.673835 -> 1.658376: 1.0093x faster
Not significant
Stddev: 0.06500 -> 0.07568: 1.1644x larger
Timeline: http://tinyurl.com/y4224pp

### pickle_dict ###
Min: 1.568636 -> 1.498363: 1.0469x faster
Avg: 1.618752 -> 1.575946: 1.0272x faster
Significant (t=4.120055)
Stddev: 0.04758 -> 0.05598: 1.1767x larger
Timeline: http://tinyurl.com/yyzl6b5

### pickle_list ###
Min: 0.771403 -> 0.752089: 1.0257x faster
Avg: 0.797367 -> 0.778438: 1.0243x faster
Significant (t=3.157783)
Stddev: 0.02620 -> 0.03332: 1.2721x larger
Timeline: http://tinyurl.com/yyp5cjx

### pybench ###
Min: 5994 -> 4470: 1.3409x faster
Avg: 6250 -> 4781: 1.3073x faster

### regex_compile ###
Min: 0.838116 -> 0.664657: 1.2610x faster
Avg: 0.846488 -> 0.691629: 1.2239x faster
Significant (t=31.710076)
Stddev: 0.01236 -> 0.03224: 2.6085x larger
Timeline: http://tinyurl.com/y65ceh8

### regex_effbot ###
Min: 0.169898 -> 0.152830: 1.1117x faster
Avg: 0.179772 -> 0.158301: 1.1356x faster
Significant (t=13.100118)
Stddev: 0.00746 -> 0.00887: 1.1895x larger
Timeline: http://tinyurl.com/yyazgxh

### regex_v8 ###
Min: 0.152255 -> 0.134914: 1.1285x faster
Avg: 0.159778 -> 0.144822: 1.1033x faster
Significant (t=10.310186)
Stddev: 0.00598 -> 0.00834: 1.3944x larger
Timeline: http://tinyurl.com/y4znhxx

### richards ###
Min: 0.361250 -> 0.281802: 1.2819x faster
Avg: 0.384307 -> 0.294562: 1.3047x faster
Significant (t=27.621845)
Stddev: 0.02043 -> 0.01052: 1.9419x smaller
Timeline: http://tinyurl.com/y3hx8w2

### slowpickle ###
Min: 0.826115 -> 0.610384: 1.3534x faster
Avg: 0.872314 -> 0.627799: 1.3895x faster
Significant (t=43.041072)
Stddev: 0.03384 -> 0.02165: 1.5626x smaller
Timeline: http://tinyurl.com/y4dr42c

### slowspitfire ###
Min: 0.820168 -> 0.697804: 1.1754x faster
Avg: 0.840062 -> 0.736274: 1.1410x faster
Significant (t=20.687150)
Stddev: 0.02540 -> 0.02477: 1.0256x smaller
Timeline: http://tinyurl.com/y6cn2c7

### slowunpickle ###
Min: 0.423866 -> 0.306436: 1.3832x faster
Avg: 0.431624 -> 0.308273: 1.4001x faster
Significant (t=103.485543)
Stddev: 0.00781 -> 0.00318: 2.4556x smaller
Timeline: http://tinyurl.com/y7p5ugb

### startup_nosite ###
Min: 0.182274 -> 0.166099: 1.0974x faster
Avg: 0.201290 -> 0.185015: 1.0880x faster
Significant (t=8.405736)
Stddev: 0.01255 -> 0.01474: 1.1748x larger
Timeline: http://tinyurl.com/y26jqjm

### threaded_count ###
Min: 0.292005 -> 0.174754: 1.6710x faster
Avg: 0.345331 -> 0.191805: 1.8004x faster
Significant (t=48.856578)
Stddev: 0.02041 -> 0.00877: 2.3267x smaller
Timeline: http://tinyurl.com/y6dl2e6

### unpack_sequence ###
Min: 0.000106 -> 0.000091: 1.1684x faster
Avg: 0.000114 -> 0.000099: 1.1433x faster
Significant (t=21.367174)
Stddev: 0.00009 -> 0.00012: 1.2958x larger
Timeline: http://tinyurl.com/y2sujno

### unpickle ###
Min: 0.908351 -> 0.803020: 1.1312x faster
Avg: 0.984448 -> 0.856525: 1.1494x faster
Significant (t=19.812585)
Stddev: 0.03248 -> 0.03209: 1.0122x smaller
Timeline: http://tinyurl.com/y4zmlaj

### unpickle_list ###
Min: 0.754476 -> 0.719254: 1.0490x faster
Avg: 0.802729 -> 0.759628: 1.0567x faster
Significant (t=6.699951)
Stddev: 0.03771 -> 0.02544: 1.4821x smaller
Timeline: http://tinyurl.com/y6tv2us


Report on Linux raffaello 2.6.31.12-0.2-desktop #1 SMP PREEMPT 2010-03-16 21:25:39 +0100 i686 athlon
Total CPU cores: 1

### 2to3 ###
43.432397 -> 43.283420: 1.0034x faster

### bzr_startup ###
Min: 0.140979 -> 0.144978: 1.0284x slower
Avg: 0.159606 -> 0.157596: 1.0128x faster
Significant (t=2.709326)
Stddev: 0.00578 -> 0.00465: 1.2418x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.175973&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.175973&chd=t:0.16,0.16,0.16,0.16,0.16,0.16,0.15,0.18,0.16,0.16,0.17,0.17,0.17,0.16,0.16,0.17,0.16,0.16,0.16,0.15,0.16,0.16,0.16,0.16,0.16,0.16,0.17,0.15,0.16,0.16,0.15,0.16,0.16,0.15,0.16,0.16,0.16,0.16,0.16,0.17,0.16,0.15,0.16,0.16,0.16,0.16,0.15,0.16,0.15,0.16,0.16,0.16,0.17,0.16,0.16,0.16,0.16,0.15,0.16,0.16,0.15,0.16,0.17,0.16,0.15,0.15,0.14,0.16,0.16,0.16,0.16,0.17,0.16,0.15,0.16,0.16,0.17,0.16,0.16,0.16,0.15,0.15,0.17,0.16,0.16,0.16,0.15,0.15,0.16,0.14,0.15,0.17,0.16,0.15,0.16,0.16,0.15,0.16,0.15,0.15|0.16,0.15,0.15,0.15,0.16,0.15,0.16,0.16,0.16,0.15,0.15,0.15,0.16,0.15,0.16,0.15,0.16,0.15,0.16,0.16,0.16,0.16,0.16,0.16,0.16,0.17,0.15,0.15,0.16,0.16,0.16,0.15,0.16,0.16,0.16,0.16,0.16,0.16,0.16,0.16,0.15,0.16,0.16,0.15,0.16,0.16,0.16,0.16,0.16,0.16,0.16,0.17,0.16,0.16,0.16,0.16,0.16,0.16,0.15,0.16,0.16,0.15,0.15,0.15,0.16,0.16,0.15,0.15,0.16,0.16,0.15,0.15,0.14,0.16,0.17,0.16,0.15,0.15,0.16,0.16,0.16,0.16,0.15,0.16,0.16,0.16,0.15,0.15,0.16,0.16,0.16,0.16,0.15,0.15,0.15,0.16,0.16,0.16,0.16,0.17&chxl=0:|1|20|40|60|80|100|2:||Iteration|3:||Time+(secs)&chtt=bzr_startup
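
The report does not say how the t value is computed. One reading that
comes close to the bzr_startup figure above is a plain two-sample t
statistic over equally sized samples, assuming 100 runs per interpreter
(the timeline axis above runs to 100 iterations). The function name and
the formula choice are assumptions of this sketch, not a description of
the benchmark tool's code.

```python
import math


def t_score(mean_base, std_base, mean_changed, std_changed, n):
    """Two-sample t statistic, assuming n samples in each group.

    Positive when the changed interpreter is faster on average.
    """
    std_err = math.sqrt((std_base ** 2 + std_changed ** 2) / n)
    return (mean_base - mean_changed) / std_err


# Avg and Stddev values from the bzr_startup entry above, n assumed to be 100.
t = t_score(0.159606, 0.00578, 0.157596, 0.00465, 100)
print(round(t, 2))  # 2.71
```

The small difference from the reported 2.709326 is consistent with the
printed averages and deviations being rounded. A negative t in other
entries corresponds to the changed interpreter being slower on average.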

### call_method ###
Min: 1.158909 -> 1.059187: 1.0941x faster
Avg: 1.161172 -> 1.113055: 1.0432x faster
Significant (t=22.522125)
Stddev: 0.00131 -> 0.02613: 19.9944x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.07623100281,2.1763420105&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.07623100281,2.1763420105&chd=t:1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.17,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.17,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16,1.16|1.13,1.16,1.13,1.11,1.11,1.14,1.16,1.08,1.1,1.14,1.14,1.14,1.1,1.13,1.15,1.15,1.13,1.14,1.17,1.11,1.1,1.11,1.14,1.11,1.13,1.11,1.14,1.11,1.11,1.15,1.13,1.18,1.16,1.1,1.1,1.1,1.12,1.08,1.11,1.09,1.09,1.09,1.12,1.16,1.08,1.1,1.08,1.12,1.13,1.15,1.14,1.16,1.13,1.14,1.16,1.09,1.14,1.15,1.13,1.11,1.1,1.09,1.1,1.1,1.11,1.11,1.11,1.14,1.15,1.12,1.13,1.14,1.16,1.14,1.09&chxl=0:|1|15|30|45|60|75|2:||Iteration|3:||Time+(secs)&chtt=call_method

### call_method_slots ###
Min: 1.149059 -> 1.078626: 1.0653x faster
Avg: 1.151797 -> 1.143283: 1.0074x faster
Significant (t=3.330294)
Stddev: 0.00124 -> 0.03128: 25.1750x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.09424901009,2.22079586983&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.09424901009,2.22079586983&chd=t:1.15,1.16,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.16,1.16,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15,1.15|1.16,1.13,1.2,1.16,1.16,1.15,1.17,1.13,1.14,1.13,1.16,1.19,1.13,1.17,1.17,1.14,1.11,1.14,1.19,1.2,1.17,1.22,1.2,1.14,1.14,1.15,1.21,1.16,1.19,1.1,1.15,1.13,1.15,1.13,1.09,1.18,1.18,1.14,1.13,1.13,1.12,1.15,1.18,1.17,1.19,1.21,1.19,1.19,1.22,1.18,1.18,1.17,1.16,1.16,1.18,1.18,1.16,1.16,1.16,1.16,1.17,1.16,1.14,1.13,1.12,1.14,1.15,1.19,1.14,1.15,1.15,1.15,1.15,1.13,1.1&chxl=0:|1|15|30|45|60|75|2:||Iteration|3:||Time+(secs)&chtt=call_method_slots

### call_method_unknown ###
Min: 1.170848 -> 1.155544: 1.0132x faster
Avg: 1.180379 -> 1.201501: 1.0179x slower
Significant (t=-9.479015)
Stddev: 0.01125 -> 0.02487: 2.2110x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.17149400711,2.26189613342&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.17149400711,2.26189613342&chd=t:1.19,1.2,1.21,1.19,1.2,1.2,1.18,1.17,1.18,1.17,1.17,1.19,1.21,1.2,1.17,1.17,1.17,1.17,1.19,1.19,1.17,1.18,1.17,1.18,1.17,1.2,1.17,1.22,1.2,1.19,1.2,1.19,1.2,1.19,1.19,1.21,1.2,1.19,1.2,1.2,1.19,1.19,1.19,1.19,1.19,1.2,1.21,1.19,1.17,1.19,1.18,1.17,1.19,1.17,1.17,1.19,1.17,1.18,1.17,1.17,1.17,1.17,1.18,1.17,1.19,1.17,1.18,1.18,1.17,1.18,1.17,1.17,1.17,1.19,1.2|1.26,1.24,1.21,1.2,1.22,1.23,1.22,1.21,1.22,1.24,1.23,1.2,1.21,1.25,1.23,1.2,1.19,1.19,1.2,1.19,1.2,1.24,1.2,1.19,1.2,1.21,1.24,1.22,1.24,1.19,1.18,1.2,1.21,1.18,1.2,1.21,1.2,1.17,1.19,1.19,1.22,1.2,1.2,1.19,1.2,1.2,1.18,1.2,1.23,1.24,1.25,1.23,1.21,1.19,1.2,1.24,1.24,1.21,1.23,1.24,1.23,1.24,1.18,1.2,1.19,1.21,1.23,1.24,1.25,1.24,1.23,1.24,1.23,1.2,1.2&chxl=0:|1|15|30|45|60|75|2:||Iteration|3:||Time+(secs)&chtt=call_method_unknown

### call_simple ###
Min: 0.905800 -> 0.908177: 1.0026x slower
Avg: 0.911217 -> 0.942381: 1.0342x slower
Significant (t=-18.575059)
Stddev: 0.00579 -> 0.01972: 3.4054x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.98918581009&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.98918581009&chd=t:0.91,0.91,0.92,0.91,0.92,0.91,0.91,0.91,0.91,0.92,0.92,0.92,0.91,0.93,0.91,0.91,0.91,0.92,0.92,0.91,0.91,0.91,0.91,0.91,0.91,0.91,0.93,0.91,0.92,0.93,0.92,0.91,0.92,0.91,0.91,0.91,0.91,0.91,0.91,0.91,0.91,0.91,0.91,0.92,0.93,0.91,0.93,0.91,0.91,0.91,0.91,0.91,0.91,0.91,0.92,0.92,0.91,0.93,0.92,0.91,0.91,0.91,0.91,0.91,0.91,0.92,0.93,0.93,0.93,0.91,0.91,0.91,0.91,0.91,0.91|0.99,0.95,0.93,0.95,0.95,0.94,0.94,0.96,0.93,0.97,0.96,0.95,0.97,0.94,0.96,0.95,0.95,0.94,0.97,0.96,0.94,0.96,0.98,0.93,0.94,0.96,0.97,0.94,0.97,0.97,0.95,0.95,0.94,0.96,0.96,0.93,0.92,0.95,0.96,0.97,0.92,0.95,0.96,0.94,0.91,0.96,0.97,0.95,0.94,0.95,0.92,0.95,0.95,0.97,0.93,0.94,0.95,0.96,0.97,0.94,0.96,0.96,0.95,0.94,0.99,0.99,0.97,0.94,0.97,0.97,0.96,0.95,0.96,0.98,0.95&chxl=0:|1|15|30|45|60|75|2:||Iteration|3:||Time+(secs)&chtt=call_simple

### float ###
Min: 0.222201 -> 0.224009: 1.0081x slower
Avg: 0.232227 -> 0.239783: 1.0325x slower
Significant (t=-9.550820)
Stddev: 0.00855 -> 0.00913: 1.0688x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.26341700554&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.26341700554&chd=t:0.24,0.24,0.25,0.25,0.24,0.24,0.25,0.24,0.25,0.24,0.24,0.24,0.24,0.23,0.25,0.24,0.25,0.23,0.25,0.23,0.24,0.25,0.24,0.23,0.24,0.25,0.25,0.25,0.24,0.25,0.24,0.24,0.23,0.25,0.24,0.25,0.23,0.25,0.24,0.24,0.25,0.25,0.23,0.24,0.24,0.24,0.25,0.24,0.24,0.24,0.25,0.23,0.25,0.24,0.24,0.23,0.25,0.23,0.24,0.25,0.24,0.23,0.24,0.24,0.24,0.25,0.24,0.25,0.24,0.25,0.24,0.25,0.24,0.25,0.23,0.25,0.24,0.25,0.25,0.25,0.23,0.24,0.25,0.24|0.25,0.25,0.26,0.26,0.24,0.25,0.26,0.25,0.26,0.25,0.25,0.25,0.25,0.24,0.26,0.25,0.25,0.24,0.25,0.24,0.25,0.26,0.25,0.24,0.25,0.25,0.26,0.26,0.25,0.25,0.25,0.26,0.24,0.26,0.25,0.25,0.25,0.26,0.25,0.26,0.26,0.25,0.23,0.24,0.25,0.24,0.25,0.24,0.25,0.24,0.24,0.23,0.25,0.24,0.26,0.24,0.25,0.25,0.25,0.25,0.25,0.23,0.25,0.25,0.24,0.25,0.25,0.25,0.25,0.26,0.24,0.26,0.25,0.25,0.24,0.26,0.24,0.25,0.26,0.25,0.24,0.25,0.25,0.25&chxl=0:|1|17|34|51|68|84|2:||Iteration|3:||Time+(secs)&chtt=float

### hg_startup ###
Min: 0.045993 -> 0.048992: 1.0652x slower
Avg: 0.057321 -> 0.056441: 1.0156x faster
Significant (t=4.488042)
Stddev: 0.00319 -> 0.00301: 1.0620x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.06599&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.06599&chd=t:0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06|0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.07,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.05,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06,0.06&chxl=0:|1|20|40|60|80|100|2:||Iteration|3:||Time+(secs)&chtt=hg_startup

### html5lib ###
Min: 26.507970 -> 25.616106: 1.0348x faster
Avg: 26.597557 -> 25.732688: 1.0336x faster
Significant (t=9.827764)
Stddev: 0.09216 -> 0.17386: 1.8865x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,24.616106,27.70594&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=24.616106,27.70594&chd=t:26.51,26.71,26.54,26.69,26.55|25.68,25.62,25.68,26.04,25.65&chxl=0:|1|2|3|4|5|2:||Iteration|3:||Time+(secs)&chtt=html5lib

### html5lib_warmup ###
Min: 25.655162 -> 25.466228: 1.0074x faster
Avg: 26.110781 -> 25.898441: 1.0082x faster
Not significant
Stddev: 0.26144 -> 0.25576: 1.0222x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,24.4662280083,27.2955319881&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=24.4662280083,27.2955319881&chd=t:25.66,26.3,26.27,26.18,26.16|25.47,26.04,26.12,25.89,25.97&chxl=0:|1|2|3|4|5|2:||Iteration|3:||Time+(secs)&chtt=html5lib_warmup

### iterative_count ###
Min: 0.369361 -> 0.223053: 1.6559x faster
Avg: 0.371506 -> 0.240774: 1.5430x faster
Significant (t=72.130793)
Stddev: 0.00198 -> 0.01266: 6.3935x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.38339400291&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.38339400291&chd=t:0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.38,0.37|0.25,0.25,0.23,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.23,0.22,0.23,0.23,0.23,0.23,0.24,0.25,0.25,0.25,0.24,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.24,0.25,0.25,0.25,0.24,0.25,0.25,0.25,0.24,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.24,0.25&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=iterative_count

### nbody ###
Min: 0.935157 -> 0.931795: 1.0036x faster
Avg: 0.946445 -> 0.943684: 1.0029x faster
Significant (t=2.384189)
Stddev: 0.00409 -> 0.00709: 1.7332x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.95390200615&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.95390200615&chd=t:0.94,0.95,0.94,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.94,0.95,0.94,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.94,0.95,0.95,0.95,0.95,0.94,0.94,0.94,0.94,0.94,0.94,0.94,0.94,0.95,0.94,0.95,0.94,0.95,0.95,0.95,0.95|0.94,0.93,0.94,0.95,0.94,0.95,0.94,0.93,0.93,0.94,0.95,0.93,0.94,0.94,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.95,0.94,0.94,0.93,0.93,0.95,0.94,0.93,0.94,0.94,0.94,0.95,0.95,0.95,0.94,0.94,0.95,0.93,0.94,0.94,0.95,0.95,0.95,0.93,0.95,0.95,0.94,0.95,0.94&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=nbody

### normal_startup ###
Min: 0.685616 -> 0.676500: 1.0135x faster
Avg: 0.686916 -> 0.678582: 1.0123x faster
Significant (t=31.273550)
Stddev: 0.00078 -> 0.00171: 2.1897x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.69004797935&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.69004797935&chd=t:0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69,0.69|0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68,0.68&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=normal_startup

### nqueens ###
Min: 0.980723 -> 0.947436: 1.0351x faster
Avg: 0.989169 -> 0.954421: 1.0364x faster
Significant (t=46.434070)
Stddev: 0.00394 -> 0.00353: 1.1181x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.99711680412&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.99711680412&chd=t:0.99,0.99,0.99,0.99,0.99,0.99,1.0,0.99,0.99,0.99,0.99,0.99,1.0,0.99,0.98,0.99,0.99,0.99,0.99,0.99,0.98,0.98,0.98,0.98,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.98,0.98,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.99,0.98,0.99,0.99,0.99|0.95,0.96,0.95,0.96,0.95,0.96,0.95,0.96,0.95,0.96,0.96,0.95,0.95,0.95,0.95,0.95,0.96,0.95,0.96,0.96,0.95,0.95,0.95,0.95,0.96,0.96,0.96,0.95,0.95,0.95,0.95,0.96,0.96,0.95,0.96,0.96,0.95,0.96,0.95,0.95,0.95,0.96,0.96,0.96,0.95,0.96,0.95,0.96,0.95,0.96&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=nqueens

### pickle ###
Min: 3.346728 -> 3.398232: 1.0154x slower
Avg: 3.367508 -> 3.415437: 1.0142x slower
Significant (t=-28.797501)
Stddev: 0.00840 -> 0.00824: 1.0186x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,2.34672808647,4.43019509315&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=2.34672808647,4.43019509315&chd=t:3.37,3.37,3.38,3.37,3.36,3.38,3.36,3.35,3.37,3.37,3.37,3.36,3.38,3.36,3.38,3.37,3.36,3.37,3.37,3.37,3.35,3.37,3.36,3.38,3.37,3.37,3.38,3.37,3.36,3.37,3.36,3.36,3.37,3.36,3.37,3.36,3.37,3.36,3.38,3.38,3.36,3.37,3.36,3.37,3.37,3.36,3.37,3.39,3.38,3.36|3.4,3.4,3.41,3.43,3.41,3.42,3.41,3.42,3.41,3.41,3.41,3.41,3.41,3.4,3.41,3.43,3.42,3.41,3.41,3.41,3.41,3.42,3.41,3.41,3.4,3.41,3.42,3.42,3.43,3.43,3.42,3.42,3.41,3.42,3.41,3.41,3.42,3.42,3.43,3.41,3.43,3.41,3.42,3.42,3.43,3.42,3.42,3.41,3.41,3.41&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=pickle

### pickle_dict ###
Min: 3.395274 -> 3.338732: 1.0169x faster
Avg: 3.513604 -> 3.359646: 1.0458x faster
Significant (t=16.225759)
Stddev: 0.06605 -> 0.01182: 5.5896x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,2.33873200417,4.60737299919&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=2.33873200417,4.60737299919&chd=t:3.49,3.56,3.59,3.57,3.51,3.46,3.59,3.55,3.4,3.52,3.49,3.59,3.57,3.43,3.55,3.6,3.5,3.43,3.43,3.56,3.42,3.44,3.52,3.53,3.6,3.6,3.41,3.46,3.4,3.46,3.4,3.54,3.57,3.54,3.55,3.6,3.58,3.48,3.48,3.42,3.6,3.57,3.4,3.61,3.5,3.51,3.45,3.54,3.54,3.57|3.36,3.35,3.34,3.36,3.38,3.37,3.36,3.35,3.35,3.35,3.36,3.36,3.34,3.36,3.37,3.36,3.36,3.35,3.38,3.34,3.36,3.37,3.39,3.38,3.35,3.36,3.35,3.35,3.36,3.36,3.36,3.34,3.37,3.36,3.38,3.37,3.38,3.35,3.35,3.36,3.36,3.38,3.37,3.34,3.35,3.35,3.35,3.36,3.35,3.36&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=pickle_dict

### pickle_list ###
Min: 1.720434 -> 1.708855: 1.0068x faster
Avg: 1.762757 -> 1.719942: 1.0249x faster
Significant (t=11.198322)
Stddev: 0.02604 -> 0.00727: 3.5808x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.70885491371,2.81176018715&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.70885491371,2.81176018715&chd=t:1.77,1.77,1.8,1.79,1.76,1.75,1.77,1.8,1.8,1.78,1.74,1.79,1.81,1.77,1.74,1.76,1.8,1.79,1.78,1.73,1.78,1.8,1.77,1.76,1.79,1.78,1.75,1.72,1.72,1.72,1.72,1.76,1.74,1.79,1.8,1.75,1.74,1.72,1.75,1.73,1.73,1.79,1.73,1.74,1.75,1.77,1.74,1.76,1.77,1.78|1.72,1.71,1.71,1.71,1.72,1.73,1.71,1.71,1.71,1.71,1.73,1.72,1.71,1.71,1.71,1.73,1.73,1.73,1.74,1.73,1.72,1.71,1.71,1.72,1.71,1.73,1.72,1.71,1.72,1.72,1.73,1.73,1.71,1.72,1.72,1.73,1.72,1.72,1.73,1.73,1.73,1.73,1.73,1.73,1.71,1.71,1.72,1.72,1.72,1.72&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=pickle_list

### pybench ###
Min: 8937 -> 8141: 1.0978x faster
Avg: 9069 -> 8266: 1.0971x faster

### regex_compile ###
Min: 1.297481 -> 1.230614: 1.0543x faster
Avg: 1.303290 -> 1.235283: 1.0551x faster
Significant (t=120.657667)
Stddev: 0.00304 -> 0.00257: 1.1834x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.23061418533,2.31539511681&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.23061418533,2.31539511681&chd=t:1.31,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.31,1.3,1.31,1.3,1.3,1.3,1.31,1.31,1.3,1.31,1.3,1.3,1.3,1.3,1.3,1.31,1.3,1.3,1.31,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.3,1.31,1.3,1.3,1.3,1.3,1.3,1.3,1.32|1.23,1.24,1.24,1.23,1.24,1.23,1.24,1.24,1.24,1.23,1.23,1.23,1.24,1.23,1.23,1.23,1.24,1.24,1.24,1.24,1.24,1.24,1.24,1.25,1.24,1.24,1.24,1.24,1.24,1.23,1.23,1.23,1.23,1.23,1.23,1.23,1.24,1.24,1.23,1.23,1.24,1.23,1.24,1.23,1.23,1.23,1.24,1.23,1.23,1.23&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=regex_compile

### regex_effbot ###
Min: 0.238711 -> 0.234200: 1.0193x faster
Avg: 0.239331 -> 0.236123: 1.0136x faster
Significant (t=19.737486)
Stddev: 0.00050 -> 0.00104: 2.0828x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.24141407013&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.24141407013&chd=t:0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24|0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.24&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=regex_effbot

### regex_v8 ###
Min: 0.229685 -> 0.217755: 1.0548x faster
Avg: 0.232979 -> 0.219208: 1.0628x faster
Significant (t=36.278688)
Stddev: 0.00217 -> 0.00157: 1.3824x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.23589801788&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.23589801788&chd=t:0.23,0.24,0.23,0.24,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.24,0.23,0.23,0.23,0.24,0.23,0.24,0.23,0.23,0.23,0.24,0.23,0.24,0.23,0.23,0.23,0.24,0.23,0.23,0.23,0.24,0.23,0.24,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.24,0.23,0.24|0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.23,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22,0.22&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=regex_v8

### richards ###
Min: 0.543314 -> 0.504176: 1.0776x faster
Avg: 0.550139 -> 0.542886: 1.0134x faster
Significant (t=3.118548)
Stddev: 0.00397 -> 0.01596: 4.0203x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.57444500923&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.57444500923&chd=t:0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.56,0.55,0.56,0.55,0.55,0.54,0.54,0.55,0.55,0.55,0.54,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.56,0.56,0.56,0.56,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55|0.53,0.55,0.55,0.55,0.55,0.54,0.53,0.53,0.55,0.54,0.55,0.55,0.56,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.55,0.54,0.56,0.53,0.51,0.56,0.51,0.56,0.52,0.56,0.51,0.54,0.57,0.53,0.57,0.52,0.57,0.54,0.54,0.53,0.53,0.52,0.54,0.5&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=richards

### slowpickle ###
Min: 1.453602 -> 1.361336: 1.0678x faster
Avg: 1.459776 -> 1.370334: 1.0653x faster
Significant (t=102.747004)
Stddev: 0.00249 -> 0.00563: 2.2567x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.36133599281,2.46742391586&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.36133599281,2.46742391586&chd=t:1.46,1.46,1.46,1.46,1.46,1.45,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.45,1.47,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46,1.46|1.37,1.38,1.37,1.36,1.38,1.38,1.38,1.37,1.37,1.37,1.37,1.38,1.38,1.38,1.38,1.37,1.37,1.36,1.36,1.36,1.36,1.36,1.36,1.36,1.37,1.37,1.36,1.37,1.37,1.37,1.37,1.37,1.37,1.38,1.38,1.38,1.38,1.37,1.37,1.37,1.37,1.37,1.38,1.38,1.37,1.37,1.37,1.36,1.36,1.36&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=slowpickle

### slowspitfire ###
Min: 1.507587 -> 1.393345: 1.0820x faster
Avg: 1.512317 -> 1.405533: 1.0760x faster
Significant (t=83.955024)
Stddev: 0.00415 -> 0.00798: 1.9254x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.39334487915,2.53158593178&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.39334487915,2.53158593178&chd=t:1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.53,1.52,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.52,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.51,1.53,1.51,1.51,1.51,1.51,1.52,1.51,1.51,1.51,1.51|1.41,1.41,1.42,1.39,1.41,1.41,1.4,1.42,1.4,1.39,1.4,1.39,1.4,1.4,1.42,1.41,1.4,1.4,1.42,1.4,1.4,1.41,1.42,1.4,1.42,1.41,1.41,1.41,1.41,1.41,1.4,1.4,1.4,1.4,1.41,1.41,1.41,1.4,1.41,1.4,1.4,1.4,1.41,1.41,1.42,1.41,1.4,1.41,1.4,1.4&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=slowspitfire

### slowunpickle ###
Min: 0.692674 -> 0.645382: 1.0733x faster
Avg: 0.695322 -> 0.648033: 1.0730x faster
Significant (t=102.284826)
Stddev: 0.00177 -> 0.00275: 1.5551x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.70394492149&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.70394492149&chd=t:0.69,0.69,0.7,0.69,0.69,0.69,0.7,0.69,0.7,0.7,0.69,0.7,0.69,0.7,0.69,0.7,0.69,0.7,0.7,0.7,0.7,0.69,0.7,0.69,0.7,0.69,0.7,0.69,0.69,0.7,0.69,0.7,0.7,0.69,0.69,0.7,0.7,0.7,0.69,0.7,0.7,0.7,0.7,0.69,0.7,0.7,0.69,0.7,0.7,0.7|0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.65,0.66,0.65,0.65,0.65,0.65&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=slowunpickle

### startup_nosite ###
Min: 0.247376 -> 0.246369: 1.0041x faster
Avg: 0.249051 -> 0.248113: 1.0038x faster
Significant (t=6.716428)
Stddev: 0.00109 -> 0.00088: 1.2345x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.25523996353&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.25523996353&chd=t:0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.26,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25|0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25,0.25&chxl=0:|1|20|40|60|80|100|2:||Iteration|3:||Time+(secs)&chtt=startup_nosite

### threaded_count ###
Min: 0.373155 -> 0.227307: 1.6416x faster
Avg: 0.374912 -> 0.234906: 1.5960x faster
Significant (t=224.886947)
Stddev: 0.00110 -> 0.00426: 3.8673x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.37840795517&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.37840795517&chd=t:0.37,0.37,0.38,0.37,0.37,0.37,0.38,0.38,0.37,0.38,0.37,0.37,0.37,0.38,0.38,0.38,0.37,0.38,0.38,0.38,0.38,0.38,0.37,0.37,0.37,0.38,0.37,0.38,0.37,0.37,0.37,0.37,0.38,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.37,0.38,0.37,0.38,0.37,0.37,0.38,0.38,0.37,0.37|0.24,0.24,0.24,0.24,0.23,0.23,0.23,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.24,0.24,0.23,0.23,0.24,0.24,0.24,0.24,0.24,0.24,0.24,0.23,0.23,0.23,0.23,0.23,0.23,0.23,0.25,0.24,0.25,0.24,0.24,0.23,0.24,0.23&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=threaded_count

### unpack_sequence ###
Min: 0.000150 -> 0.000159: 1.0605x slower
Avg: 0.000153 -> 0.000161: 1.0550x slower
Significant (t=-450.521988)
Stddev: 0.00000 -> 0.00000: 1.2070x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0,1.00053215027&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0,1.00053215027&chd=t:0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0|0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0&chxl=0:|1|20|40|60|80|100|2:||Iteration|3:||Time+(secs)&chtt=unpack_sequence

### unpickle ###
Min: 2.042838 -> 2.023408: 1.0096x faster
Avg: 2.054084 -> 2.037836: 1.0080x faster
Significant (t=13.396235)
Stddev: 0.00551 -> 0.00657: 1.1931x larger
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,1.0234079361,3.0667848587&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=1.0234079361,3.0667848587&chd=t:2.06,2.06,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.06,2.06,2.06,2.06,2.05,2.06,2.06,2.06,2.06,2.05,2.04,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.04,2.04,2.05,2.06,2.06,2.06,2.06,2.06,2.06,2.06,2.07,2.06,2.06,2.06,2.05,2.06|2.04,2.04,2.04,2.04,2.04,2.02,2.04,2.03,2.02,2.04,2.04,2.04,2.03,2.03,2.04,2.04,2.04,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.05,2.03,2.03,2.03,2.03,2.03,2.03,2.03,2.04,2.03,2.03,2.04,2.04,2.04,2.04,2.04,2.04,2.04,2.04,2.04,2.04,2.04,2.03,2.03,2.03&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=unpickle

### unpickle_list ###
Min: 1.542357 -> 1.645569: 1.0669x slower
Avg: 1.554601 -> 1.654697: 1.0644x slower
Significant (t=-93.061602)
Stddev: 0.00647 -> 0.00400: 1.6147x smaller
Timeline: http://chart.apis.google.com/chart?cht=lc&chs=700x400&chxt=x,y,x,y&chxr=1,0.54235696793,2.66085600853&chco=FF0000,0000FF&chdl=/btrfs/src/Python-2.6.4/python|/btrfs/src/wpython2-wpython11/python&chds=0.54235696793,2.66085600853&chd=t:1.56,1.55,1.55,1.56,1.55,1.56,1.56,1.56,1.56,1.55,1.55,1.54,1.56,1.55,1.55,1.54,1.55,1.55,1.55,1.54,1.54,1.55,1.55,1.54,1.55,1.54,1.55,1.55,1.56,1.55,1.55,1.55,1.56,1.55,1.56,1.56,1.56,1.56,1.56,1.56,1.56,1.56,1.56,1.57,1.56,1.56,1.56,1.56,1.57,1.56|1.66,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.65,1.66,1.66,1.65,1.66,1.66,1.65,1.66,1.66,1.66,1.66,1.66,1.66,1.65,1.66,1.65,1.66,1.66,1.66,1.66,1.66,1.65,1.66,1.66,1.65,1.66,1.66,1.66,1.66,1.66,1.65,1.66&chxl=0:|1|10|20|30|40|50|2:||Iteration|3:||Time+(secs)&chtt=unpickle_list
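
The "x faster" and "x slower" figures in the report above appear to be plain ratios of the two timings (baseline first, WPython second). The sketch below reproduces two of the reported values under that assumption; `describe_change` is a hypothetical helper written for illustration, not part of the benchmark tool, and the formula is inferred from the numbers rather than taken from the tool's source.

```python
def describe_change(base, changed):
    """Summarise a timing change the way the report lines appear to.

    Assumption: the reported factor is the larger timing divided by the
    smaller one, labelled by which side got quicker.
    """
    if changed == base:
        return "no change"
    if changed < base:
        return "%.4fx faster" % (base / changed)
    return "%.4fx slower" % (changed / base)


# Values copied from the iterative_count and pickle "Min:" lines above.
iterative_count_min = describe_change(0.369361, 0.223053)
pickle_min = describe_change(3.346728, 3.398232)
print(iterative_count_min)
print(pickle_min)
```

The Stddev ratios cannot be reproduced exactly this way because the report prints them from unrounded values.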


Report on Darwin unknown-00-1e-c2-bc-ea-b3.config 10.3.0 Darwin Kernel Version 10.3.0: Fri Feb 26 11:58:09 PST 2010; root:xnu-1504.3.12~1/RELEASE_I386 i386 i386
Total CPU cores: 2

### 2to3 ###
25.590659 -> 23.666681: 1.0813x faster

### bzr_startup ###
Min: 0.102069 -> 0.099751: 1.0232x faster
Avg: 0.102827 -> 0.100411: 1.0241x faster
Significant (t=20.360035)
Stddev: 0.00072 -> 0.00094: 1.3152x larger
Timeline: http://tinyurl.com/y6yjv5w

### call_method ###
Min: 0.606348 -> 0.548343: 1.1058x faster
Avg: 0.609875 -> 0.556685: 1.0955x faster
Significant (t=54.742949)
Stddev: 0.00303 -> 0.01151: 3.7924x larger
Timeline: http://tinyurl.com/y7wkkmp

### call_method_slots ###
Min: 0.641415 -> 0.549939: 1.1663x faster
Avg: 0.648512 -> 0.571999: 1.1338x faster
Significant (t=66.043832)
Stddev: 0.01162 -> 0.00815: 1.4253x smaller
Timeline: http://tinyurl.com/y7mlu86

### call_method_unknown ###
Min: 0.675142 -> 0.613596: 1.1003x faster
Avg: 0.685377 -> 0.616531: 1.1117x faster
Significant (t=35.991776)
Stddev: 0.02328 -> 0.00260: 8.9669x smaller
Timeline: http://tinyurl.com/y6p65wk

### call_simple ###
Min: 0.443526 -> 0.425943: 1.0413x faster
Avg: 0.447255 -> 0.442844: 1.0100x faster
Significant (t=4.469438)
Stddev: 0.00569 -> 0.01066: 1.8738x larger
Timeline: http://tinyurl.com/y8xbq2f

### float ###
Min: 0.102775 -> 0.096776: 1.0620x faster
Avg: 0.110484 -> 0.102809: 1.0747x faster
Significant (t=13.220150)
Stddev: 0.00738 -> 0.00546: 1.3507x smaller
Timeline: http://tinyurl.com/yyhutwh

### hg_startup ###
Min: 0.045108 -> 0.043234: 1.0433x faster
Avg: 0.046845 -> 0.043972: 1.0653x faster
Significant (t=28.354118)
Stddev: 0.00206 -> 0.00095: 2.1622x smaller
Timeline: http://tinyurl.com/y5b9xx5

### html5lib ###
Min: 15.549443 -> 14.847499: 1.0473x faster
Avg: 15.582542 -> 14.859007: 1.0487x faster
Significant (t=64.534012)
Stddev: 0.02167 -> 0.01261: 1.7190x smaller
Timeline: http://tinyurl.com/y3g6t44

### html5lib_warmup ###
Min: 15.770884 -> 15.074864: 1.0462x faster
Avg: 16.133120 -> 15.319287: 1.0531x faster
Significant (t=4.375747)
Stddev: 0.30506 -> 0.28266: 1.0793x smaller
Timeline: http://tinyurl.com/y2xcn3m

### iterative_count ###
Min: 0.147178 -> 0.085756: 1.7162x faster
Avg: 0.151184 -> 0.088620: 1.7060x faster
Significant (t=49.925293)
Stddev: 0.00651 -> 0.00601: 1.0834x smaller
Timeline: http://tinyurl.com/yybv496

### nbody ###
Min: 0.471700 -> 0.463253: 1.0182x faster
Avg: 0.483086 -> 0.475017: 1.0170x faster
Significant (t=3.488633)
Stddev: 0.01129 -> 0.01183: 1.0477x larger
Timeline: http://tinyurl.com/y6lrfst

### normal_startup ###
Min: 0.811946 -> 0.789491: 1.0284x faster
Avg: 0.854893 -> 0.819687: 1.0430x faster
Significant (t=5.095698)
Stddev: 0.03899 -> 0.02943: 1.3249x smaller
Timeline: http://tinyurl.com/yydc2u4

### nqueens ###
Min: 0.597376 -> 0.570333: 1.0474x faster
Avg: 0.606725 -> 0.588271: 1.0314x faster
Significant (t=5.653285)
Stddev: 0.00920 -> 0.02117: 2.3015x larger
Timeline: http://tinyurl.com/y3n2fg3

### pickle ###
Min: 1.651874 -> 1.574163: 1.0494x faster
Avg: 1.680315 -> 1.612453: 1.0421x faster
Significant (t=10.340275)
Stddev: 0.02313 -> 0.04023: 1.7395x larger
Timeline: http://tinyurl.com/y7r55ms

### pickle_dict ###
Min: 1.308464 -> 1.275010: 1.0262x faster
Avg: 1.318127 -> 1.296507: 1.0167x faster
Significant (t=4.484688)
Stddev: 0.00605 -> 0.03355: 5.5471x larger
Timeline: http://tinyurl.com/y4j9v5q

### pickle_list ###
Min: 0.743117 -> 0.803173: 1.0808x slower
Avg: 0.751905 -> 0.810111: 1.0774x slower
Significant (t=-44.249464)
Stddev: 0.00663 -> 0.00652: 1.0172x smaller
Timeline: http://tinyurl.com/y633yb6

### pybench ###
Min: 4763 -> 4342: 1.0970x faster
Avg: 4988 -> 4463: 1.1176x faster

### regex_compile ###
Min: 0.740278 -> 0.661458: 1.1192x faster
Avg: 0.764527 -> 0.685639: 1.1151x faster
Significant (t=15.011621)
Stddev: 0.02380 -> 0.02854: 1.1995x larger
Timeline: http://tinyurl.com/y524doe

### regex_effbot ###
Min: 0.096349 -> 0.096083: 1.0028x faster
Avg: 0.100523 -> 0.099285: 1.0125x faster
Not significant
Stddev: 0.00504 -> 0.00327: 1.5444x smaller
Timeline: http://tinyurl.com/y3e6z2j

### regex_v8 ###
Min: 0.107875 -> 0.104745: 1.0299x faster
Avg: 0.114243 -> 0.109286: 1.0454x faster
Significant (t=2.325803)
Stddev: 0.01377 -> 0.00612: 2.2522x smaller
Timeline: http://tinyurl.com/y4qvh3d

### richards ###
Min: 0.329455 -> 0.286851: 1.1485x faster
Avg: 0.340571 -> 0.298913: 1.1394x faster
Significant (t=13.324069)
Stddev: 0.01252 -> 0.01822: 1.4556x larger
Timeline: http://tinyurl.com/y3d8zxk

### slowpickle ###
Min: 0.717864 -> 0.646023: 1.1112x faster
Avg: 0.748511 -> 0.659941: 1.1342x faster
Significant (t=17.041455)
Stddev: 0.03039 -> 0.02067: 1.4701x smaller
Timeline: http://tinyurl.com/y5ht5y5

### slowspitfire ###
Min: 0.797233 -> 0.762146: 1.0460x faster
Avg: 0.839011 -> 0.812074: 1.0332x faster
Significant (t=4.203713)
Stddev: 0.02803 -> 0.03560: 1.2699x larger
Timeline: http://tinyurl.com/y7owc3g

### slowunpickle ###
Min: 0.320963 -> 0.289625: 1.1082x faster
Avg: 0.325532 -> 0.293422: 1.1094x faster
Significant (t=17.014061)
Stddev: 0.00791 -> 0.01075: 1.3598x larger
Timeline: http://tinyurl.com/y5dcwdj

### startup_nosite ###
Min: 0.210807 -> 0.219255: 1.0401x slower
Avg: 0.222933 -> 0.232971: 1.0450x slower
Significant (t=-4.776980)
Stddev: 0.01592 -> 0.01372: 1.1601x smaller
Timeline: http://tinyurl.com/y2cexr7

### threaded_count ###
Min: 0.195203 -> 0.113455: 1.7205x faster
Avg: 0.225064 -> 0.176248: 1.2770x faster
Significant (t=12.769360)
Stddev: 0.00850 -> 0.02566: 3.0192x larger
Timeline: http://tinyurl.com/y74c4w3

### unpack_sequence ###
Min: 0.000092 -> 0.000083: 1.1095x faster
Avg: 0.000094 -> 0.000085: 1.1058x faster
Significant (t=61.506288)
Stddev: 0.00002 -> 0.00002: 1.1541x smaller
Timeline: http://tinyurl.com/yykzcrg

### unpickle ###
Min: 1.026543 -> 1.018970: 1.0074x faster
Avg: 1.048295 -> 1.042098: 1.0059x faster
Not significant
Stddev: 0.01646 -> 0.03854: 2.3408x larger
Timeline: http://tinyurl.com/y786tft

### unpickle_list ###
Min: 0.908621 -> 0.905129: 1.0039x faster
Avg: 0.926660 -> 0.928462: 1.0019x slower
Not significant
Stddev: 0.01631 -> 0.01509: 1.0806x smaller
Timeline: http://tinyurl.com/y5m6s3u


From ncoghlan at gmail.com  Wed Jun 23 12:58:00 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 23 Jun 2010 20:58:00 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C21D15F.8070304@egenix.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<4C20FC54.9000608@egenix.com>
	<AANLkTin2CNb-zb3BpjSNWVIQDLCVwUh6-FNRQd4XKAP7@mail.gmail.com>
	<4C21D15F.8070304@egenix.com>
Message-ID: <AANLkTimXsLawYF06W8QTpIUX0evy1r0wBJdq8CpPv0NA@mail.gmail.com>

On Wed, Jun 23, 2010 at 7:18 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Note that the point of using a builtin method was to get
> better performance. Such type adaptions are often needed in
> loops, so adding a few extra Python function calls just to
> convert a str object to a bytes object or vice-versa is a
> bit much overhead.

I actually agree with that, I just think we need more real world
experience as to what works with the Python 3 text model before we
start messing with the APIs for the builtin objects (fair point that
"coerce" is a loaded term given the existence of the old coercion
protocol. It's the right word for the task though).

One of the key points coming out of this thread (to my mind) is the
lack of a Text ABC or other way of making an object that can be passed
to functions expecting a str instance with a reasonable expectation of
having it work. Are there some core string capabilities that can be
identified and then expanded out to a full str-compatible API? (i.e.
something along the lines of what collections.MutableMapping now
provides for dict-alikes).
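
As a rough illustration of the idea in the paragraph above, here is a minimal sketch of what such an ABC could look like in current Python 3. The name `Text`, the choice of abstract methods, and the `ReversedText` class are all assumptions made for this example; no such ABC exists in the standard library.

```python
from abc import ABC, abstractmethod


class Text(ABC):
    """Hypothetical ABC: a small abstract core plus derived mixin methods."""

    @abstractmethod
    def __len__(self):
        ...

    @abstractmethod
    def __getitem__(self, index):
        ...

    # Mixin methods built only on the abstract core above.
    def __iter__(self):
        for i in range(len(self)):
            yield self[i]

    def __contains__(self, char):
        # Simplified: only checks for a single character, not a substring.
        return any(c == char for c in self)

    def startswith(self, prefix):
        return len(prefix) <= len(self) and all(
            self[i] == prefix[i] for i in range(len(prefix)))


# The builtin str is registered as a virtual subclass.
Text.register(str)


class ReversedText(Text):
    """Toy third-party string: a reversed view of a str (int indexes only)."""

    def __init__(self, data):
        self._data = data

    def __len__(self):
        return len(self._data)

    def __getitem__(self, index):
        return self._data[len(self._data) - 1 - index]


view = ReversedText("abc")
print(isinstance("abc", Text), view.startswith("cb"), "".join(view))
```

This mirrors how the mapping ABC derives a full dict-like API from a handful of abstract methods; which methods would form the right core for strings is exactly the open question raised above.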

However, even if something like that was added, PJE is correct in
pointing out that builtin strings still don't play well with others in
many cases (usually due to underlying optimisations or other sound
reasons, but perhaps sometimes gratuitously). Most of the string
binary operations can be dealt with through their reflected forms, but
str.__mod__ will never return NotImplemented, __contains__ has no
reflected form and the actual method calls are of course right out
(e.g. the arguments to str.join() or str.split() calls have no ability
to affect the type of the result).
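
The asymmetry described above can be seen with a small string-like class. `Markup` is a hypothetical type invented for this sketch, and the behaviour shown is that of current Python 3.

```python
class Markup:
    """Hypothetical third-party string-like type, for illustration only."""

    def __init__(self, text):
        self.text = text

    def __radd__(self, other):
        # Called for "text" + Markup(...), so the result type can be controlled.
        return Markup(str(other) + self.text)

    def __rmod__(self, other):
        # Not reached for "fmt" % Markup(...): str.__mod__ formats the
        # operand itself and does not return NotImplemented.
        return Markup(str(other) % self.text)

    def __str__(self):
        return self.text


m = Markup("world")

combined = "hello " + m      # a Markup, produced by __radd__
formatted = "hello %s" % m   # a plain str, produced by str.__mod__

# Containment has no reflected form, so the right-hand str decides alone.
contains_failed = False
try:
    m in "hello world"
except TypeError:
    contains_failed = True

print(type(combined).__name__, type(formatted).__name__, contains_failed)
```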

Third party number implementations couldn't provide comparable
functionality to builtin int and long objects until the __index__
protocol was added. Perhaps PJE is right that what this is really
crying out for is a way to have third party "real string"
implementations so that there can actually be genuine experimentation
in the Unicode handling space outside the language core (comparable to
the difference between the "you can turn me into an int" __int__
method and the "I am an int equivalent" __index__ method).
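
A short sketch of the `__int__` versus `__index__` distinction mentioned above, using two hypothetical classes and current Python 3 behaviour:

```python
class Convertible:
    """Can be turned into an int on request, but is not an integer itself."""

    def __int__(self):
        return 2


class IntegerLike:
    """Declares itself to be equivalent to an integer."""

    def __index__(self):
        return 2


items = ["a", "b", "c"]

explicit = items[int(Convertible())]  # works, but needs an explicit int() call
implicit = items[IntegerLike()]       # works directly, via __index__

# Without __index__ the object is rejected as a list index.
rejected = False
try:
    items[Convertible()]
except TypeError:
    rejected = True

print(explicit, implicit, rejected)
```

A comparable protocol for strings would let a third-party type say "I am a str equivalent" rather than only "you can convert me with str()".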

That may be tapping in a nail with a sledgehammer (and would raise
significant moratorium questions if pursued further), but I think it's
a valid question to at least ask.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at pearwood.info  Wed Jun 23 13:12:40 2010
From: steve at pearwood.info (Steven D'Aprano)
Date: Wed, 23 Jun 2010 21:12:40 +1000
Subject: [Python-Dev] WPython 1.1 was released
In-Reply-To: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>
References: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>
Message-ID: <201006232112.41047.steve@pearwood.info>

On Wed, 23 Jun 2010 08:12:36 pm Cesare Di Mauro wrote:
> I've released WPython 1.1, which brings many optimizations and
> refactorings.

For those of us who don't know what WPython is, and are too lazy, too 
busy, or reading their email off-line, could you give us a short 
one-paragraph description of what it is?

Actually, since I'm none of the above, I'll answer my own question: 
WPython is an implementation of Python that uses 16-bit wordcodes 
instead of byte code, and claims to have various performance benefits 
from doing so.

It looks like good work, thank you.


-- 
Steven D'Aprano

From cesare.di.mauro at gmail.com  Wed Jun 23 13:28:58 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Wed, 23 Jun 2010 13:28:58 +0200
Subject: [Python-Dev] WPython 1.1 was released
In-Reply-To: <201006232112.41047.steve@pearwood.info>
References: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>
	<201006232112.41047.steve@pearwood.info>
Message-ID: <AANLkTik-L_dmJnhS81jgUO6wHYu-73-twkrYjFgloHyf@mail.gmail.com>

2010/6/23 Steven D'Aprano <steve at pearwood.info>

> On Wed, 23 Jun 2010 08:12:36 pm Cesare Di Mauro wrote:
> > I've released WPython 1.1, which brings many optimizations and
> > refactorings.
>
> For those of us who don't know what WPython is, and are too lazy, too
> busy, or reading their email off-line, could you give us a short
> one-paragraph description of what it is?
>
> Actually, since I'm none of the above, I'll answer my own question:
> WPython is an implementation of Python that uses 16-bit wordcodes
> instead of byte code, and claims to have various performance benefits
> from doing so.
>
> It looks like good work, thank you.
>
> --
> Steven D'Aprano
>

Hi Steven,

sorry, I made a mistake, assuming that the project was known.

WPython is a CPython 2.6.4 implementation that uses "wordcodes" instead of
bytecodes. A wordcode is a word (16 bits, two bytes, in this case) used to
represent VM opcodes. This new encoding made it possible to simplify the
main loop of the virtual machine, making it easier to understand, maintain,
and extend; less space is required on average, and execution speed is
improved too.
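
To make the idea of a fixed-width 16-bit instruction concrete, here is a minimal sketch. The layout (opcode in the low byte, argument in the high byte) and the opcode numbers are assumptions chosen for illustration; they are not taken from WPython's actual encoding.

```python
import struct


def encode(instructions):
    """Pack (opcode, arg) pairs into little-endian 16-bit words.

    Assumed layout, for illustration only: low byte = opcode,
    high byte = argument.
    """
    words = []
    for opcode, arg in instructions:
        if not (0 <= opcode <= 0xFF and 0 <= arg <= 0xFF):
            raise ValueError("opcode and arg must each fit in one byte")
        words.append(struct.pack("<H", (arg << 8) | opcode))
    return b"".join(words)


def decode(code):
    """Yield (opcode, arg) pairs, reading exactly one word per instruction."""
    for (word,) in struct.iter_unpack("<H", code):
        yield word & 0xFF, word >> 8


# Hypothetical opcode numbers; every instruction is the same size.
program = [(100, 1), (23, 0), (83, 0)]
code = encode(program)
decoded = list(decode(code))
print(len(code), decoded)
```

Because every instruction has the same width, a decoder of this kind never needs to check whether an opcode carries an argument before advancing.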

Cesare

From steve at holdenweb.com  Wed Jun 23 14:17:20 2010
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 23 Jun 2010 08:17:20 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <AANLkTilfd__XUG4coogDG61tnFabhUL6ZnvbS7LJFo2O@mail.gmail.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
	<hvqorq$i69$1@dough.gmane.org>
	<AANLkTilfd__XUG4coogDG61tnFabhUL6ZnvbS7LJFo2O@mail.gmail.com>
Message-ID: <4C21FB50.1080905@holdenweb.com>

Guido van Rossum wrote:
> On Tue, Jun 22, 2010 at 9:37 AM, Tres Seaver <tseaver at palladion.com> wrote:
>> Any "turdiness" (which I am *not* arguing for) is a natural consequence
>> of the kinds of backward incompatibilities which were *not* ruled out
>> for Python 3, along with the (early, now waning) "build it and they will
>>  come" optimism about adoption rates.
> 
> FWIW, my optimism is *not* waning. I think it's good that we're
> having this discussion and I expect something useful will come out of
> it; I also expect in general that the (admittedly serious) problem of
> having to port all dependencies will be solved in the next few years.
> Not by magic, but because many people are taking small steps in the
> right direction, and there will be light eventually. In the meantime
> I don't blame anyone for sticking with 2.x or being too busy to help
> port stuff to 3.x. Python 3 has been a long time in the making -- it
> will be a bit longer still, which was expected.
> 
+1

The important thing is to avoid bigotry and FUD, and deal with things
the way they are. The #python IRC team have just helped us make a major
step forward. This won't be a campaign with a victorious charge over
some imaginary finish line.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000

From alexander.belopolsky at gmail.com  Wed Jun 23 16:06:27 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 23 Jun 2010 10:06:27 -0400
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <B2467B5D-5751-4557-9488-356D5F95660B@mac.com>
References: <73196.1277143019@parc.com>
	<AANLkTimSSnMyae9rwUog2dEqtWrPPey5yysLdyukClyc@mail.gmail.com>
	<75635.1277147585@parc.com> <20100621212904.7bec83f6@pitrou.net>
	<77297.1277150242@parc.com>
	<1277150570.3369.1.camel@localhost.localdomain>
	<4C1FC7E6.5070707@voidspace.org.uk> <4C1FD5D6.7070007@v.loewis.de>
	<4C1FD84B.3030202@voidspace.org.uk> <4C1FDB65.4020503@v.loewis.de>
	<4C1FDF1C.2060308@voidspace.org.uk> <4C1FE4AF.80009@v.loewis.de>
	<AANLkTimXfNNH7iInOOKiTCj1RpOf6CInO9ABxpGTTgDQ@mail.gmail.com>
	<EAE2C517-C5EA-4EF7-A4A0-286C3B08381D@mac.com>
	<AANLkTimrLwsklrQzBLbjf0LOCycp_gRa97gqc721NsNs@mail.gmail.com>
	<B2467B5D-5751-4557-9488-356D5F95660B@mac.com>
Message-ID: <AANLkTik3LnZvI1gV_AA2mHmlmMYXKpQ9qy7O8sdFt04w@mail.gmail.com>

On Wed, Jun 23, 2010 at 2:08 AM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
..
> I don't agree. The patch itself is pretty simple, but it does make a rather significant change to the build process: the
> compile-time environment in configure would be different than during the compilation of posixmodule. That is, the tests
> that check for features (the HAVE_FOOBAR macros in pyconfig.h) would use _DARWIN_C_SOURCE while posixmodule
> itself wouldn't. This may lead to subtle bugs, or even compile errors (because some function definitions change when
> _DARWIN_C_SOURCE is active).

I agree.  Messing with compatibility macros outside of pyconfig.h is
not a good idea.  Martin's hack, while likely to work in most cases,
is still a hack.  I believe, however, that we can undefine _DARWIN_C_SOURCE
globally, at least on 10.4 and higher.  I grepped through the headers
on my 10.6 system and noticed that the majority of checks for
_DARWIN_C_SOURCE are of the form

#if !defined(_POSIX_C_SOURCE) || defined(_DARWIN_C_SOURCE)

According to a comment in configure,

  # On Mac OS X 10.4, defining _POSIX_C_SOURCE or _XOPEN_SOURCE
  # disables platform specific features beyond repair.
  # On Mac OS X 10.3, defining _POSIX_C_SOURCE or _XOPEN_SOURCE
  # has no effect, don't bother defining them

_POSIX_C_SOURCE is already undefined in python headers, so undefining
_DARWIN_C_SOURCE will have no effect on the majority of checks.

I was able to find very few exceptions:  some cases check
_XOPEN_SOURCE instead of or in addition to _POSIX_C_SOURCE before
ignoring _DARWIN_C_SOURCE:

/usr/include/grp.h:#if !defined(_XOPEN_SOURCE) || defined(_DARWIN_C_SOURCE)
/usr/include/pwd.h:#if (!defined(_POSIX_C_SOURCE) &&
!defined(_XOPEN_SOURCE)) || defined(_DARWIN_C_SOURCE)
..

Since _XOPEN_SOURCE is similarly undefined in python headers, these
cases are unaffected as well.
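
As a rough illustration of the argument above, the three guard forms quoted so far can be modeled in Python, with a set standing in for the macros that are defined. This is only a sketch: the function names are made up for the example, and nothing here is taken from the Apple headers beyond the guard expressions themselves.

```python
def posix_guard(defined):
    # Models: #if !defined(_POSIX_C_SOURCE) || defined(_DARWIN_C_SOURCE)
    return "_POSIX_C_SOURCE" not in defined or "_DARWIN_C_SOURCE" in defined

def grp_guard(defined):
    # Models grp.h: #if !defined(_XOPEN_SOURCE) || defined(_DARWIN_C_SOURCE)
    return "_XOPEN_SOURCE" not in defined or "_DARWIN_C_SOURCE" in defined

def pwd_guard(defined):
    # Models pwd.h: #if (!defined(_POSIX_C_SOURCE) && !defined(_XOPEN_SOURCE))
    #                   || defined(_DARWIN_C_SOURCE)
    neither = "_POSIX_C_SOURCE" not in defined and "_XOPEN_SOURCE" not in defined
    return neither or "_DARWIN_C_SOURCE" in defined

# With no standard macro defined, _DARWIN_C_SOURCE does not change any guard.
for macros in (set(), {"_DARWIN_C_SOURCE"}):
    print(sorted(macros), posix_guard(macros), grp_guard(macros), pwd_guard(macros))
```

In this model _DARWIN_C_SOURCE only changes the outcome once _POSIX_C_SOURCE or _XOPEN_SOURCE is defined, which is the case the Python headers avoid.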

This leaves a handful of cases where Apple provides additional macros
for fine grained control:

/usr/include/stdio.h:#if defined(__DARWIN_10_6_AND_LATER) &&
(defined(_DARWIN_UNLIMITED_STREAMS) || defined(_DARWIN_C_SOURCE))
/usr/include/unistd.h:#if defined(_DARWIN_UNLIMITED_GETGROUPS) ||
defined(_DARWIN_C_SOURCE)

The second line above is our dear friend, and the _DARWIN_C_SOURCE
behavior conditioned on the first line can be enabled by defining the
_DARWIN_UNLIMITED_STREAMS macro.

I believe _DARWIN_C_SOURCE casts its net too wide and more targeted
macros should be used instead.

..
>     Defining _POSIX_C_SOURCE or _DARWIN_C_SOURCE causes library and kernel calls to conform
>     to the SUSv3 standards even if doing so would alter the behavior of functions used in 10.3.

I cannot reconcile this with !defined(_POSIX_C_SOURCE) ||
defined(_DARWIN_C_SOURCE) logic that I see in the headers.

From pje at telecommunity.com  Wed Jun 23 16:24:18 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Wed, 23 Jun 2010 10:24:18 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <5A4340BB-7B64-4C76-81FF-8A43F179AA7A@twistedmatrix.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<AANLkTing604RkO4B6gR5ZqCHcHP1o9Ng71OfqK97HrR5@mail.gmail.com>
	<AANLkTim_2Z6K5GqhnQ6rpGmN30WqSjBjLsHgNWyZKo3d@mail.gmail.com>
	<5A4340BB-7B64-4C76-81FF-8A43F179AA7A@twistedmatrix.com>
Message-ID: <20100623142422.36F873A404D@sparrow.telecommunity.com>

At 08:34 PM 6/22/2010 -0400, Glyph Lefkowitz wrote:
>I suspect the practical problem here is that there's no CharacterString ABC

That, and the absence of a string coercion protocol so that mixing 
your custom string with standard strings will do the right thing for 
your intended use.


From alexander.belopolsky at gmail.com  Wed Jun 23 16:48:24 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 23 Jun 2010 10:48:24 -0400
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
Message-ID: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>

On Wed, Jun 23, 2010 at 2:08 AM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
..
>>
>>> * [Ronald's proposal] results in posix.getgroups not reflecting results of posix.setgroups
>>>
>>
>> This effectively substitutes getgrouplist called on the current user
>> for getgroups.  In 3.x, I believe the correct action will be to
>> provide direct access to getgrouplist which, while not POSIX (yet?),
>> is widely available.
>
> I don't mind adding getgrouplist, but that issue is separate from this one. BTW, apparently getgrouplist is POSIX
> (<http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/libc.html>), although this isn't a
> requirement for being added to the posix module.
>

(The link you provided leads to "Linux Standard Base Core
Specification," which is different from POSIX, but the distinction is
not relevant for our discussion.)

>
> It is still my opinion that the second option is preferable for better compatibility with system tools, even if the patch
> is more complicated and the library function we use can be considered to be broken.

Let me try to formulate what the disagreement is.  There are two
different group lists that can be associated with a running process:
1) The list of current supplementary group IDs maintained by the
system for each process and stored in per-process system tables; and
2) The list of the groups that include the uid under which the process
is running as a member.

The first list is returned by a system call getgroups and the second
can be obtained using system database access functions as follows:

pw = getpwuid(getuid())
getgrouplist(pw->pw_name, ..)

The first list can be modified by privileged processes using setgroups
system call, while the second changes when system databases change.
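
For comparison, the two lists can be obtained separately from Python. This is a minimal sketch, assuming a Unix system and a Python version that provides os.getgrouplist (3.3 and later, so not the releases discussed in this thread); the two function names are made up for the example.

```python
import os
import pwd

def process_groups():
    # List 1: supplementary group IDs the system holds for this process.
    return os.getgroups()

def database_groups():
    # List 2: groups that the user database assigns to the process's uid.
    # pwd.getpwuid raises KeyError if the uid has no database entry.
    pw = pwd.getpwuid(os.getuid())
    return os.getgrouplist(pw.pw_name, pw.pw_gid)
```

The two results can differ, for example after a privileged process has called os.setgroups or after the group database has changed since login.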

The problem that _DARWIN_C_SOURCE introduces is that it replaces
system getgroups with a database query effectively making the true
process' list of supplementary group IDs inaccessible to programs.
See source code at
<http://www.opensource.apple.com/source/Libc/Libc-594.1.4/sys/getgroups.c>.

The problem is complicated by the fact that the true getgroups call
on OSX appears to truncate the list of groups to NGROUPS_MAX=16.  Note,
however, that it is not clear whether the system call truncates the
list or the underlying process tables are limited to 16 entries and
additional groups are ignored when the process is created.

In my view, getgroups and getgrouplist are two fundamentally different
operations and both should be provided by the os module.  Redefining
os.getgroups to invoke getgrouplist instead of system getgroups on one
particular platform to work around that platform's system call
limitation is not right.

From ronaldoussoren at mac.com  Wed Jun 23 17:03:39 2010
From: ronaldoussoren at mac.com (ronaldoussoren)
Date: Wed, 23 Jun 2010 08:03:39 -0700 (PDT)
Subject: [Python-Dev] red buildbots on 2.7
In-Reply-To: <AANLkTik3LnZvI1gV_AA2mHmlmMYXKpQ9qy7O8sdFt04w@mail.gmail.com>
Message-ID: <91321b7f-d5a2-6f2f-8ecd-813636aaa3bd@me.com>


On 23 Jun, 2010, at 04:06 PM, Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:

> On Wed, Jun 23, 2010 at 2:08 AM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> ..
>> I don't agree. The patch itself is pretty simple, but it does make a rather significant change to the build process: the
>> compile-time environment in configure would be different than during the compilation of posixmodule. That is, the tests
>> that check for features (the HAVE_FOOBAR macros in pyconfig.h) would use _DARWIN_C_SOURCE while posixmodule
>> itself wouldn't. This may lead to subtle bugs, or even compile errors (because some function definitions change when
>> _DARWIN_C_SOURCE is active).
>
> I agree. Messing with compatibility macros outside of pyconfig.h is
> not a good idea. Martin's hack, while likely to work in most cases,
> is still a hack. I believe, however, that we can undefine _DARWIN_C_SOURCE
> globally, at least on 10.4 and higher. I grepped through the headers
> on my 10.6 system and noticed that the majority of checks for
> _DARWIN_C_SOURCE are of the form

As I wrote, the system will assume _DARWIN_C_SOURCE is set when you don't set _POSIX_C_SOURCE or other feature macros. Working around that is a hack that I don't wish to support.


> ..
>>     Defining _POSIX_C_SOURCE or _DARWIN_C_SOURCE causes library and kernel calls to conform
>>     to the SUSv3 standards even if doing so would alter the behavior of functions used in 10.3.
>
> I cannot reconcile this with !defined(_POSIX_C_SOURCE) ||
> defined(_DARWIN_C_SOURCE) logic that I see in the headers.

This seems to be arranged in sys/cdefs.h. I honestly don't care how this is done; the documentation clearly says that this happens, and that indicates that _DARWIN_C_SOURCE selects the API Apple would like you to use.

Anyway, why is this discussion on python-dev instead of in the issue tracker?

BTW. IMHO resolution of this issue can wait until after 2.7.0, there is always 2.7.1 and I don't think we need to rush this (the issue has been dormant for quite a while)

Ronald


From tseaver at palladion.com  Wed Jun 23 17:30:23 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 23 Jun 2010 11:30:23 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <hvt9af$t6n$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Stephen J. Turnbull wrote:

> We do need str-based implementations of modules like urllib.

Why would that be?  URLs aren't text, and never will be.  The fact that
to the eye they may seem to be text-ish doesn't make them text.  This
*is* a case where "don't make me think" is a losing proposition:
programmers who work with URLs in any non-opaque way as text are
eventually going to be bitten by this issue no matter how hard we wave
our hands.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwiKI4ACgkQ+gerLs4ltQ56/QCbBPdj8jaPbcvPIDPb7ys04oHg
fLIAnR+kA2udazsnpzTp2INGz2CoWgzj
=Swjw
-----END PGP SIGNATURE-----


From alexander.belopolsky at gmail.com  Wed Jun 23 17:37:12 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Wed, 23 Jun 2010 11:37:12 -0400
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
Message-ID: <AANLkTilhnIMiiKAkDnat99ND788MjF8iO1qEx-TFDJWJ@mail.gmail.com>

In my previous post, I forgot to include the link to the tracker issue
where this problem is being worked on.

http://bugs.python.org/issue7900

I'll repost my message there as an issue comment, so that a more
detailed technical discussion can continue there.

From tseaver at palladion.com  Wed Jun 23 17:37:53 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 23 Jun 2010 11:37:53 -0400
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
Message-ID: <hvt9oh$t4v$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Alexander Belopolsky wrote:

> In my view, getgroups and getgrouplist are two fundamentally different
> operations and both should be provided by the os module.  Redefining
> os.getgroups to invoke getgrouplist instead of system getgroups on one
> particular platform to work around that platform's system call
> limitation is not right.

+1.  syscall wrappers should err on the side of thinness, even to the
point of anorexia.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwiKlEACgkQ+gerLs4ltQ4vKwCg3JwpWvivq8Dk7PYy2iPrKq/E
88gAn1lfeEcDJlfGm+F0jEbxsv1BfQJW
=JzHS
-----END PGP SIGNATURE-----


From guido at python.org  Wed Jun 23 17:43:46 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 23 Jun 2010 08:43:46 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <hvt9af$t6n$1@dough.gmane.org>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<hvt9af$t6n$1@dough.gmane.org>
Message-ID: <AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>

On Wed, Jun 23, 2010 at 8:30 AM, Tres Seaver <tseaver at palladion.com> wrote:
> Stephen J. Turnbull wrote:
>
>> We do need str-based implementations of modules like urllib.
>
> Why would that be?  URLs aren't text, and never will be.  The fact that
> to the eye they may seem to be text-ish doesn't make them text.  This
> *is* a case where "don't make me think" is a losing proposition:
> programmers who work with URLs in any non-opaque way as text are
> eventually going to be bitten by this issue no matter how hard we wave
> our hands.

This has been asserted and contested several times now, and I don't
see the two positions getting any closer.

So I propose that we drop the discussion "are URLs text or bytes" and
try to find something more pragmatic to discuss.

For example: how we can make the suite of functions used for URL
processing more polymorphic, so that each developer can choose for
herself how URLs need to be treated in her application.
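
One shape such polymorphism can take is sketched below: the same function accepts either str or bytes and returns components of the same type, while refusing to mix the two. As far as I know, urllib.parse behaves this way in later Python 3 releases (3.2 onward), which postdate this thread; the example.com URLs are placeholders.

```python
from urllib.parse import urlsplit, urljoin

# str in, str out
text_parts = urlsplit("http://example.com/a?b=c")

# bytes in, bytes out (the input must be ASCII-only bytes)
byte_parts = urlsplit(b"http://example.com/a?b=c")

# Mixing the two types in one call is rejected rather than coerced.
mixed_error = None
try:
    urljoin("http://example.com/", b"a")
except TypeError as exc:
    mixed_error = exc

print(text_parts.path, byte_parts.path, type(mixed_error).__name__)
```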

-- 
--Guido van Rossum (python.org/~guido)

From cyounkins at gmail.com  Wed Jun 23 17:51:31 2010
From: cyounkins at gmail.com (Craig Younkins)
Date: Wed, 23 Jun 2010 11:51:31 -0400
Subject: [Python-Dev] Use of cgi.escape can lead to XSS vulnerabilities
In-Reply-To: <10286.1277242190@parc.com>
References: <AANLkTil9vEOPLgenyPOu-F3OuC-R61ak_KoEvNARrGvW@mail.gmail.com>
	<10286.1277242190@parc.com>
Message-ID: <AANLkTimbWvx7PGi_owX_EFFWVBMGZSRYHWiA5YLPtteT@mail.gmail.com>

http://bugs.python.org/issue9061

On Tue, Jun 22, 2010 at 5:29 PM, Bill Janssen <janssen at parc.com> wrote:

> Craig Younkins <cyounkins at gmail.com> wrote:
>
> > cgi.escape never escapes single quote characters, which can easily lead to a
> > Cross-Site Scripting (XSS) vulnerability. This seems to be known by many,
> > but a quick search reveals many are using cgi.escape for HTML attribute
> > escaping.
>
> Did you file a bug report?
>
> Bill
>
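
A small sketch of the problem described in the quoted report, using html.escape from later Python 3 releases (3.2 and up) rather than cgi.escape itself. The helper name and the payload are made up for the example; the helper only mimics escaping that leaves single quotes alone.

```python
import html

def escape_without_quotes(s):
    # Stand-in for escaping that handles &, < and > but not quote characters.
    return html.escape(s, quote=False)

payload = "' onmouseover='alert(1)"

# The attribute is delimited by single quotes, so the unescaped quote
# in the payload closes it and injects a new attribute.
unsafe = "<a title='%s'>x</a>" % escape_without_quotes(payload)

# With quote=True both " and ' are replaced by character references.
safe = "<a title='%s'>x</a>" % html.escape(payload, quote=True)

print(unsafe)
print(safe)
```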

From barry at python.org  Wed Jun 23 18:03:27 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 23 Jun 2010 12:03:27 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
Message-ID: <20100623120327.3bd030e9@heresy>

On Jun 23, 2010, at 08:43 AM, Guido van Rossum wrote:

>So I propose that we drop the discussion "are URLs text or bytes" and
>try to find something more pragmatic to discuss.

email has exactly the same question, and the answer is "yes". <wink>

>For example: how we can make the suite of functions used for URL
>processing more polymorphic, so that each developer can choose for
>herself how URLs need to be treated in her application.

I think email package hackers should watch this effort closely.  RDM has
written some stuff up on how we think we're going to handle this, though it's
probably pretty email package specific.  Maybe there's a better, general, or
conventional approach lurking around somewhere.

http://wiki.python.org/moin/Email%20SIG

-Barry

From janssen at parc.com  Wed Jun 23 18:11:05 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 23 Jun 2010 09:11:05 PDT
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <hvt9af$t6n$1@dough.gmane.org>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
Message-ID: <13070.1277309465@parc.com>

Tres Seaver <tseaver at palladion.com> wrote:

> Stephen J. Turnbull wrote:
> 
> > We do need str-based implementations of modules like urllib.
> 
> Why would that be?  URLs aren't text, and never will be.  The fact that
> to the eye they may seem to be text-ish doesn't make them text.  This

URLs are exactly text (strings, representable as Unicode strings in
Py3K), and were designed as such from the start.  The fact that some of
the things tunneled or carried in URLs are string representations of
non-string data shouldn't obscure that point.  They're not "text-ish",
they're text.  They're not opaque, either; they break down in
well-specified ways, mainly into strings.

The trouble comes in when we try to go beyond the spec, or handle things
that don't conform to the spec.  Sure, a path component of a URI might
actually be a %-escaped sequence of arbitrary bytes, even bytes that
don't represent a string in any known encoding, but that's only *after*
reversing the %-escapes, which should happen in a scheme-specific piece
of code, not in generic URL parsing or manipulation.
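
To illustrate the ordering described above, here is a minimal sketch in which the URL is split as text first and the %-escapes of one component are reversed only afterwards, yielding arbitrary bytes. The example.com URL is a placeholder, and unquote_to_bytes is the urllib.parse function as I understand it in Python 3.

```python
from urllib.parse import urlsplit, unquote_to_bytes

url = "http://example.com/files/%FF%FEname?x=1"

# Generic parsing works on the text form and leaves the escapes alone.
parts = urlsplit(url)

# Only scheme-specific code reverses the escapes; the result need not
# be decodable in any text encoding.
raw_path = unquote_to_bytes(parts.path)

print(parts.path)
print(raw_path)
```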

Bill


From ianb at colorstudy.com  Wed Jun 23 18:30:51 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 23 Jun 2010 11:30:51 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <hvt9af$t6n$1@dough.gmane.org>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<hvt9af$t6n$1@dough.gmane.org>
Message-ID: <AANLkTikwIxoxWas-rCz508GIozIueuu6bs-Pk0edao-T@mail.gmail.com>

On Wed, Jun 23, 2010 at 10:30 AM, Tres Seaver <tseaver at palladion.com> wrote:

>  Stephen J. Turnbull wrote:
>
> > We do need str-based implementations of modules like urllib.
>
>
> Why would that be?  URLs aren't text, and never will be.  The fact that
> to the eye they may seem to be text-ish doesn't make them text.  This
> *is* a case where "don't make me think" is a losing proposition:
> programmers who work with URLs in any non-opaque way as text are
> eventually going to be bitten by this issue no matter how hard we wave
> our hands.
>

HTML is text, and URLs are embedded in that text, so it's easy to get a URL
that is text.  Though, with a little testing, I notice that text alone can't
tell you what the right URL really is (at least the intended URL when unsafe
characters are embedded in HTML).

To test I created two pages, one in Latin-1 another in UTF-8, and put in the
link:

  ./test.html?param=Réunion

On a Latin-1 page it created a link to test.html?param=R%E9union and on a
UTF-8 page it created a link to test.html?param=R%C3%A9union (the second
link displays in the URL bar as test.html?param=Réunion but copies with
percent encoding).  Though if you link to ./Réunion.html then both pages
create UTF-8 links.  And both pages also link http://Réunion.com to
http://xn--runion-bva.com/.  So really neither bytes nor text works
completely; query strings receive the encoding of the page, which would be
handled transparently if you worked on the page's bytes.  Path and domain
are consistently encoded with UTF-8 and punycode respectively and so would
be handled best when treated as text.  And of course if you are a page with
a non-ASCII-compatible encoding you really must handle encodings before the
URL is sensible.
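
The three encodings observed in that experiment can be reproduced with the standard library. This is a sketch of the encodings only, not of what any browser does internally; the variable names are made up for the example.

```python
from urllib.parse import quote

word = "R\u00e9union"  # "Reunion" with e-acute as its second letter

# Query string on a Latin-1 page: one byte for the accented letter.
latin1_query = quote(word, encoding="latin-1")

# Query string on a UTF-8 page, and path on either page: two bytes.
utf8_query = quote(word, encoding="utf-8")

# Host name: IDNA (punycode), independent of the page encoding.
host = "r\u00e9union.com".encode("idna")

print(latin1_query, utf8_query, host)
```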

Another issue here is that there's no "encoding" for turning a URL into
bytes if the URL is not already ASCII.  A proper way to encode a URL would
be:

(Totally as an aside, as I remind myself of new module names I notice it's
not easy to google specifically for Python 3 docs, e.g. "python 3 urlsplit"
gives me 2.6 docs)

from urllib.parse import urlsplit

def encode_http_url(url, page_encoding='ASCII', errors='strict'):
    # Encode each component of a str URL with the encoding it uses on the wire.
    scheme, netloc, path, query, fragment = urlsplit(url)
    scheme = scheme.encode('ASCII', errors)
    auth = port = None
    if '@' in netloc:
        auth, netloc = netloc.split('@', 1)
    if ':' in netloc:
        netloc, port = netloc.split(':', 1)
    # The 'idna' codec converts the host label by label;
    # encodings.idna.ToASCII handles only a single label.
    netloc = netloc.encode('idna')
    if port:
        netloc = netloc + b':' + port.encode('ASCII', errors)
    if auth:
        netloc = auth.encode('UTF-8', errors) + b'@' + netloc
    # Path and fragment are UTF-8; the query takes the page's encoding.
    path = path.encode('UTF-8', errors)
    query = query.encode(page_encoding, errors)
    fragment = fragment.encode('UTF-8', errors)
    return urlunsplit_bytes((scheme, netloc, path, query, fragment))

Where urlunsplit_bytes handles bytes (urlunsplit does not).  It's helpful
for me at least to look at that code specifically:

def urlunsplit(components):
    scheme, netloc, url, query, fragment = components
    if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
        if url and url[:1] != '/': url = '/' + url
        url = '//' + (netloc or '') + url
    if scheme:
        url = scheme + ':' + url
    if query:
        url = url + '?' + query
    if fragment:
        url = url + '#' + fragment
    return url

In this case it really would be best to have Python 2's system where things
are coerced to ASCII implicitly.  Or, more specifically, if all those string
literals in that routine could be implicitly converted to bytes using
ASCII.  Conceptually I think this is reasonable, as for URLs (at least with
HTTP, but in practice I think this applies to all URLs) the ASCII bytes
really do have meaning.  That is, '/' (*in the context of urlunsplit*)
really is \x2f specifically.  Or another example, making a GET request
really means sending the bytes \x47\x45\x54 and there is no other set of
bytes that has that meaning.  The WebSockets specification for instance
defines things like "colon":
http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-76#page-5 -- in
an earlier version they even used bytes to describe HTTP (
http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-54#page-13),
though this annoyed many people.
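
A bytes version of that routine might look like the sketch below, with every delimiter written as an explicit ASCII bytes literal. urlunsplit_bytes is the hypothetical name used above, not an existing library function, and the sketch simplifies the original by dropping the uses_netloc branch.

```python
def urlunsplit_bytes(components):
    # Bytes counterpart of urlunsplit; all five components must be bytes.
    scheme, netloc, url, query, fragment = components
    if netloc:
        if url and url[:1] != b'/':
            url = b'/' + url
        url = b'//' + netloc + url
    if scheme:
        url = scheme + b':' + url
    if query:
        url = url + b'?' + query
    if fragment:
        url = url + b'#' + fragment
    return url

# The delimiters and the method name are specific byte values.
print(b'/' == b'\x2f', b'GET' == b'\x47\x45\x54')
print(urlunsplit_bytes((b'http', b'xn--runion-bva.com', b'/test.html',
                        b'param=R%E9union', b'')))
```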

-- 
Ian Bicking  |  http://blog.ianbicking.org

From janssen at parc.com  Wed Jun 23 18:46:48 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 23 Jun 2010 09:46:48 PDT
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
Message-ID: <13837.1277311608@parc.com>

Guido van Rossum <guido at python.org> wrote:

> So I propose that we drop the discussion "are URLs text or bytes" and
> try to find something more pragmatic to discuss.
> 
> For example: how we can make the suite of functions used for URL
> processing more polymorphic, so that each developer can choose for
> herself how URLs need to be treated in her application.

While I agree with "find something more pragmatic to discuss", it also
seems to me that introducing polymorphic URL processing might make
things more confusing and error-prone.

The bigger problem seems to be that we're revisiting the design
discussion about urllib.parse from the summer of 2008.  See
http://bugs.python.org/issue3300 if you want to recall how we hashed
this out 2 years ago.  I didn't particularly like that design, but I had
to go off on vacation :-), and things got settled while I was away.  I
haven't heard much from Matt Giuca since he stopped by and lobbed that
patch into the standard library.

But since Guido is the one who settled it, why are we talking about it
again?

Bill

From ianb at colorstudy.com  Wed Jun 23 18:49:13 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Wed, 23 Jun 2010 11:49:13 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTikwIxoxWas-rCz508GIozIueuu6bs-Pk0edao-T@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTikwIxoxWas-rCz508GIozIueuu6bs-Pk0edao-T@mail.gmail.com>
Message-ID: <AANLkTimJQZLd7eaJ7txX2hbzXVueQGBAxh-gJQ6hq4hE@mail.gmail.com>

Oops, I forgot some important quoting (important for the algorithm,
maybe not actually for the discussion)...

from urllib.parse import urlsplit, urlunsplit
import encodings.idna

# urllib.parse.quote both always returns str, and is not as
# conservative in quoting as required here...
def quote_unsafe_bytes(b):
    result = []
    for c in b:
        if c < 0x20 or c >= 0x80:
            result.extend(('%%%02X' % c).encode('ASCII'))
        else:
            result.append(c)
    return bytes(result)

def encode_http_url(url, page_encoding='ASCII', errors='strict'):
    scheme, netloc, path, query, fragment = urlsplit(url)
    scheme = scheme.encode('ASCII', errors)
    auth = port = None
    if '@' in netloc:
        auth, netloc = netloc.split('@', 1)
    if ':' in netloc:
        netloc, port = netloc.split(':', 1)
    netloc = encodings.idna.ToASCII(netloc)
    if port:
        netloc = netloc + b':' + port.encode('ASCII', errors)
    if auth:
        netloc = quote_unsafe_bytes(auth.encode('UTF-8', errors)) + b'@' + netloc
    path = quote_unsafe_bytes(path.encode('UTF-8', errors))
    query = quote_unsafe_bytes(query.encode(page_encoding, errors))
    fragment = quote_unsafe_bytes(fragment.encode('UTF-8', errors))
    return urlunsplit_bytes((scheme, netloc, path, query, fragment))
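
The function above ends by calling urlunsplit_bytes, which is not defined
anywhere in this message. A minimal sketch of what such a helper might look
like, assuming it is meant as a simplified bytes analogue of
urllib.parse.urlunsplit (the name comes from the code above; the body below
is an assumption, not the author's implementation and not a standard
library function):

```python
def urlunsplit_bytes(parts):
    """Join (scheme, netloc, path, query, fragment) bytes into one URL.

    Hypothetical helper: a simplified bytes analogue of
    urllib.parse.urlunsplit, not part of the standard library.
    """
    scheme, netloc, path, query, fragment = parts
    url = path
    if netloc:
        # A non-empty authority needs a leading '/' on a non-empty path.
        if url and not url.startswith(b'/'):
            url = b'/' + url
        url = b'//' + netloc + url
    if scheme:
        url = scheme + b':' + url
    if query:
        url = url + b'?' + query
    if fragment:
        url = url + b'#' + fragment
    return url
```

With a helper of this shape, every component stays bytes from the point
where it is encoded until the final join, so no implicit str/bytes mixing
can occur.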


--
Ian Bicking  |  http://blog.ianbicking.org

From glyph at twistedmatrix.com  Wed Jun 23 03:01:17 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Tue, 22 Jun 2010 21:01:17 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim3Gsuh_idn4UG8AvvQTMlM00b2M-WxVhxoudnN@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<94700B9C-25B4-4A75-BA43-20FEA3FDE772@twistedmatrix.com>
	<AANLkTim3Gsuh_idn4UG8AvvQTMlM00b2M-WxVhxoudnN@mail.gmail.com>
Message-ID: <CA06A7B6-C2E3-45F0-BC02-CD3491BC03C8@twistedmatrix.com>


On Jun 22, 2010, at 8:57 PM, Robert Collins wrote:

> bzr has a cache of decoded strings in it precisely because decode is
> slow. We accept slowness encoding to the user's locale because that's
> typically much less data to examine than we've examined while
> generating the commit/diff/whatever. We also face memory pressure on a
> regular basis, and that has been, at least partly, due to UCS4 - our
> translation cache helps there because we have less duplicate UCS4
> strings.

Thanks for setting the record straight - apologies if I missed this earlier in the thread.  It does seem vaguely familiar.


From tjreedy at udel.edu  Wed Jun 23 19:38:05 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Wed, 23 Jun 2010 13:38:05 -0400
Subject: [Python-Dev] WPython 1.1 was released
In-Reply-To: <AANLkTik-L_dmJnhS81jgUO6wHYu-73-twkrYjFgloHyf@mail.gmail.com>
References: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>	<201006232112.41047.steve@pearwood.info>
	<AANLkTik-L_dmJnhS81jgUO6wHYu-73-twkrYjFgloHyf@mail.gmail.com>
Message-ID: <hvtgpu$qoh$1@dough.gmane.org>

On 6/23/2010 7:28 AM, Cesare Di Mauro wrote:

> sorry, I made a mistake, assuming that the project was known.

A common mistake of people who announce their projects ;-)
Someone recently made the same mistake on python-list with respect to a 
'BDD' package (Wikipedia suggests about 6 possible expansions of the 
acronym).
>
> WPython is a CPython 2.6.4 implementation that uses "wordcodes" instead
> of bytecodes. A wordcode is a word (16 bits, two bytes, in this case)

I suggest you specify the base version (2.6.4) on the project page as 
that would be very relevant to many who visit. One should not have to 
download and look at the source to discover if they should 
bother downloading the code. Perhaps also add a sentence as to the 
choice (why not 3.1?).


-- 
Terry Jan Reedy


From cesare.di.mauro at gmail.com  Wed Jun 23 19:53:46 2010
From: cesare.di.mauro at gmail.com (Cesare Di Mauro)
Date: Wed, 23 Jun 2010 19:53:46 +0200
Subject: [Python-Dev] WPython 1.1 was released
In-Reply-To: <hvtgpu$qoh$1@dough.gmane.org>
References: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>
	<201006232112.41047.steve@pearwood.info>
	<AANLkTik-L_dmJnhS81jgUO6wHYu-73-twkrYjFgloHyf@mail.gmail.com>
	<hvtgpu$qoh$1@dough.gmane.org>
Message-ID: <AANLkTilPBsYFOKWmpq5yXfSwzbAThT96DHKNOq-3Z6Uo@mail.gmail.com>

2010/6/23 Terry Reedy <tjreedy at udel.edu>

> On 6/23/2010 7:28 AM, Cesare Di Mauro wrote:
> WPython is a CPython 2.6.4 implementation that uses "wordcodes" instead
> of bytecodes. A wordcode is a word (16 bits, two bytes, in this case)
>
> I suggest you specify the base version (2.6.4) on the project page as that
> would be very relevant to many who visit. One should not have to download
> and look at the source to discover if they should bother
> downloading the code. Perhaps also add a sentence as to the choice (why not
> 3.1?).
>
> --
> Terry Jan Reedy


Thanks for the suggestions. I've updated the main project accordingly. :)

Cesare

From tseaver at palladion.com  Wed Jun 23 20:23:33 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 23 Jun 2010 14:23:33 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <13837.1277311608@parc.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>	<hvt9af$t6n$1@dough.gmane.org>	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<13837.1277311608@parc.com>
Message-ID: <hvtjf5$4vs$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bill Janssen wrote:

> The bigger problem seems to be that we're revisiting the design
> discussion about urllib.parse from the summer of 2008.  See
> http://bugs.python.org/issue3300 if you want to recall how we hashed
> this out 2 years ago.  I didn't particularly like that design, but I had
> to go off on vacation :-), and things got settled while I was away.  I
> haven't heard much from Matt Giuca since he stopped by and lobbed that
> patch into the standard library.
> 
> But since Guido is the one who settled it, why are we talking about it
> again?

Perhaps such decisions need revisiting in light of subsequent experience
/ pain / learning.  E.g:

- - the repeated inability of the web-sig to converge on appropriate
  semantics for a Python3-compatible version of the WSGI spec;

- - the subsequent quirkiness of the Python3 wsgiref implementation;

- - the breakage in cgi.py which prevents handling file uploads in a
  web application;

- - the slow adoption / porting rate of major web frameworks and libraries
  to Python 3.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwiUSAACgkQ+gerLs4ltQ49EwCeLYwrZs6QfairPP5zpeeUlxao
qg8An37kRz1CrzGc3kScvSqVx8FPnO1M
=lR6R
-----END PGP SIGNATURE-----


From martin at v.loewis.de  Wed Jun 23 20:29:44 2010
From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 23 Jun 2010 20:29:44 +0200
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
Message-ID: <4C225298.9010701@v.loewis.de>

> The problem that _DARWIN_C_SOURCE introduces is that it replaces
> system getgroups with a database query effectively making the true
> process' list of supplementary group IDs inaccessible to programs.
> See source code at
> <http://www.opensource.apple.com/source/Libc/Libc-594.1.4/sys/getgroups.c>.

If that is true (i.e. the file is really the one that is being used),
I think this is a severe flaw in OSX's implementation of the POSIX 
specification.

Then, I agree that Python, in turn, should make sure that 
posix.getgroups is really the POSIX version of getgroups, not the Apple 
version. This is a general principle: if the system has two competing 
implementations of some API, the Python posix module should strive to 
call the POSIX version of the API. If the vendor's version of the API is 
also useful, it can be exposed under a different name (if, in turn, this 
is technically possible).

Just my 0.02€.

Regards,
Martin

From glyph at twistedmatrix.com  Wed Jun 23 20:31:41 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Wed, 23 Jun 2010 14:31:41 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <4C21FB50.1080905@holdenweb.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>
	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>
	<hvijae$9tc$1@dough.gmane.org>	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>
	<hvqorq$i69$1@dough.gmane.org>
	<AANLkTilfd__XUG4coogDG61tnFabhUL6ZnvbS7LJFo2O@mail.gmail.com>
	<4C21FB50.1080905@holdenweb.com>
Message-ID: <9A9D719C-0ED5-4061-B314-06450CC965BB@twistedmatrix.com>


On Jun 23, 2010, at 8:17 AM, Steve Holden wrote:

> Guido van Rossum wrote:
>> On Tue, Jun 22, 2010 at 9:37 AM, Tres Seaver <tseaver at palladion.com> wrote:
>>> Any "turdiness" (which I am *not* arguing for) is a natural consequence
>>> of the kinds of backward incompatibilities which were *not* ruled out
>>> for Python 3, along with the (early, now waning) "build it and they will
>>> come" optimism about adoption rates.
>> 
>> FWIW, my optimism is *not* waning. I think it's good that we're
>> having this discussion and I expect something useful will come out of
>> it; I also expect in general that the (admittedly serious) problem of
>> having to port all dependencies will be solved in the next few years.
>> Not by magic, but because many people are taking small steps in the
>> right direction, and there will be light eventually. In the mean time
>> I don't blame anyone for sticking with 2.x or being too busy to help
>> port stuff to 3.x. Python 3 has been a long time in the making -- it
>> will be a bit longer still, which was expected.
>> 
> +1
> 
> The important thing is to avoid bigotry and FUD, and deal with things
> the way they are. The #python IRC team have just helped us make a major
> step forward. This won't be a campaign with a victorious charge over
> some imaginary finish line.

For sure.

I don't speak for Tres, but I don't think he was talking about optimism about *adoption*, overall, but optimism about adoption *rates*.  And I don't think he was talking about it coming from Guido :).

There has definitely been some "irrational exuberance" from some quarters.  The form it usually takes is someone making a blog post which assumes, because the author could port their smallish library or application without too much hassle, that Python 2.x is already dead and everyone should be off of it in a couple of weeks.

I've never heard this position from the core team or any official communication or documentation.  Far from it: the realistic attitude that the Python 3 migration is something that will take a while has significantly reduced my own concerns.

Even the aforementioned blog posts have been encouraging in some ways, because a lot of people are reporting surprisingly easy transitions.


From tseaver at palladion.com  Wed Jun 23 20:40:47 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Wed, 23 Jun 2010 14:40:47 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <9A9D719C-0ED5-4061-B314-06450CC965BB@twistedmatrix.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<20100618204831.A8F2A3A40A5@sparrow.telecommunity.com>	<AANLkTiklnpy1zbh0usp2AZ6LEBuSOa1C0XxnjlTZwwH7@mail.gmail.com>	<hvijae$9tc$1@dough.gmane.org>	<609CF661-AB50-49FC-BAA9-B8898C1E9A19@gmail.com>	<hvqorq$i69$1@dough.gmane.org>	<AANLkTilfd__XUG4coogDG61tnFabhUL6ZnvbS7LJFo2O@mail.gmail.com>	<4C21FB50.1080905@holdenweb.com>
	<9A9D719C-0ED5-4061-B314-06450CC965BB@twistedmatrix.com>
Message-ID: <hvtkfg$8i8$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Glyph Lefkowitz wrote:

> I don't speak for Tres, but I don't think he was talking about
> optimism about *adoption*, overall, but optimism about adoption
> *rates*.  And I don't think he was talking about it coming from Guido
> :).

You channel me correctly here.  In particular, the phrase "build it and
they will come" was meant to address the idea that the only thing needed
to drive adoption was the release of the new, shiny Python3.  That
particular bit of optimism is what I meant to describe as waning:  the
community on the whole seems to be more realistic now than two or three
years ago about the kind of extra effort required from both core
developers and from existing Python 2 folks to get to Python 3.

> There has definitely been some "irrational exuberance" from some
> quarters.  The form it usually takes is someone making a blog post
> which assumes, because the author could port their smallish library
> or application without too much hassle, that Python 2.x is already
> dead and everyone should be off of it in a couple of weeks.
> 
> I've never heard this position from the core team or any official
> communication or documentation.  Far from it: the realistic attitude
> that the Python 3 migration is something that will take a while has
> significantly reduced my own concerns.
> 
> Even the aforementioned blog posts have been encouraging in some
> ways, because a lot of people are reporting surprisingly easy
> transitions.

Indeed.


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwiVS8ACgkQ+gerLs4ltQ4kQgCeJ9nwU8XyiWzOTpHSbWg21bzU
0/IAnjVOj5SlgA9mnAsx4/wMad5lNkqq
=HObh
-----END PGP SIGNATURE-----


From solipsis at pitrou.net  Wed Jun 23 21:36:45 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 23 Jun 2010 21:36:45 +0200
Subject: [Python-Dev] bytes / unicode
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<13837.1277311608@parc.com> <hvtjf5$4vs$1@dough.gmane.org>
Message-ID: <20100623213645.658517d7@pitrou.net>

On Wed, 23 Jun 2010 14:23:33 -0400
Tres Seaver <tseaver at palladion.com> wrote:
> 
> Perhaps such decisions need revisiting in light of subsequent experience
> / pain / learning.  E.g:
> 
> - - the repeated inability of the web-sig to converge on appropriate
>   semantics for a Python3-compatible version of the WSGI spec;
> 
> - - the subsequent quirkiness of the Python3 wsgiref implementation;

The way wsgiref was adapted is admittedly suboptimal. It was totally
broken at first, and PJE didn't want to look very deeply into it. We
therefore had to settle on a series of small modifications that seemed
rather reasonable, but without any in-depth discussion of what WSGI had
to look like under Python 3 (since it was not our job and responsibility).

Therefore, I don't think wsgiref should be taken as a guide to what
a cleaned up, Python 3-specific WSGI must look like.

> - - the slow adoption / porting rate of major web frameworks and libraries
>   to Python 3.

Some of the major web frameworks and libraries have a ton of
dependencies, which would explain why they really haven't bothered yet.

I don't think you can claim, though, that Python 3 makes things
significantly harder for these frameworks. The proof is that many of
them already give the user unicode strings in Python 2.x. They must
have somehow got the decoding right.

Regards

Antoine.


From ronaldoussoren at mac.com  Wed Jun 23 22:31:42 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Wed, 23 Jun 2010 22:31:42 +0200
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
Message-ID: <02EFB202-505A-405E-AE00-ABC0A2234DDA@mac.com>


On 23 Jun, 2010, at 16:48, Alexander Belopolsky wrote:

> On Wed, Jun 23, 2010 at 2:08 AM, Ronald Oussoren <ronaldoussoren at mac.com> wrote:
> ..
>>> 
>>>> * [Ronald's proposal] results in posix.getgroups not reflecting results of posix.setgroups
>>>> 
>>> 
>>> This effectively substitutes getgrouplist called on the current user
>>> for getgroups.  In 3.x, I believe the correct action will be to
>>> provide direct access to getgrouplist which is while not POSIX (yet?),
>>> is widely available.
>> 
>> I don't mind adding getgrouplist, but that issue is separate from this one. BTW, apparently getgrouplist is posix
>> (<http://refspecs.freestandards.org/LSB_3.1.1/LSB-Core-generic/LSB-Core-generic/libc.html>), although this isn't a
>> requirement for being added to the posix module.
>> 
> 
> (The link you provided leads to "Linux Standard Base Core
> Specification," which is different from POSIX, but the distinction is
> not relevant for our discussion.)

I know, but the page claims getgrouplist is in SUS.  I've since looked at what claims to be a copy of SUS: http://www.unix.org/single_unix_specification/ and that does not contain getgrouplist. 

> 
>> 
>> It is still my opinion that the second option is preferable for better compatibility with system tools, even if the patch
>> is more complicated and the library function we use can be considered to be broken.
> 
> Let me try to formulate what the disagreement is.  There are two
> different group lists that can be associated with a running process:
> 1) The list of current supplementary group IDs maintained by the
> system for each process and stored in per-process system tables; and
> 2) The list of the groups that include the uid under which the process
> is running as a member.
> 
> The first list is returned by a system call getgroups and the second
> can be obtained using system database access functions as follows:
> 
> pw = getpwuid(getuid())
> getgrouplist(pw->pw_name, ..)
> 
> The first list can be modified by privileged processes using setgroups
> system call, while the second changes when system databases change.
> 
> The problem that _DARWIN_C_SOURCE introduces is that it replaces
> system getgroups with a database query effectively making the true
> process' list of supplementary group IDs inaccessible to programs.
> See source code at
> <http://www.opensource.apple.com/source/Libc/Libc-594.1.4/sys/getgroups.c>.
> 
> The problem is complicated by the fact that OSX true getgroups call
> appears to truncate the list of groups to NGROUPS_MAX=16.  Note,
> however that it is not clear whether the system call truncates the
> list or the underlying process tables are limited to 16 entries and
> additional groups are ignored when the process is created.
> 
> In my view, getgroups and getgrouplist are two fundamentally different
> operations and both should be provided by the os module.  Redefining
> os.getgroups to invoke getgrouplist instead of system getgroups on one
> particular platform to work around that platform's system call
> limitation is not right.

But we don't redefine os.getgroups to call getgrouplist; it is the system library that
seems to implement getgroups(3) using getgrouplist(3).  I agree that that is odd at best,
but it is IMHO functioning as designed by Apple (that is, Apple chose
the current behavior, they didn't accidentally break this).

The previous paragraph is nitpicky, but this is IMO an important distinction.


I've done some more experimentation:

*  compat(5) lies: not setting _DARWIN_C_SOURCE is not the same as setting _DARWIN_C_SOURCE when the deployment target is 10.5. With _DARWIN_C_SOURCE, getgroups is translated to the symbol "_getgroups$DARWIN_EXTSN" in the object file; without it, the symbol is "_getgroups".

* the id(1) command uses the version of getgroups that does not reflect setgroups. Given this script:
import os

os.system("id")
os.setgroups([1])
os.system("id")

Running it gives an unexpected output:

# /usr/bin/python doit.py
uid=0(root) gid=0(wheel) groups=0(wheel),204(_developer),100(_lpoperator),98(_lpadmin),80(admin),61(localaccounts),29(certusers),20(staff),12(everyone),9(procmod),8(procview),5(operator),4(tty),3(sys),2(kmem),1(daemon),401(com.apple.access_screensharing)
uid=0(root) gid=0(wheel) groups=0(wheel),204(_developer),100(_lpoperator),98(_lpadmin),80(admin),61(localaccounts),29(certusers),20(staff),12(everyone),9(procmod),8(procview),5(operator),4(tty),3(sys),2(kmem),1(daemon),401(com.apple.access_screensharing)

* when I add a group in the Accounts panel in System Preferences and add my account to it the id(1) command immediately reflects the change (as expected given the previous result)

* adding a non-administrator account to a newly created group does not affect filesystem access for existing processes (that is, I created a file that's only readable for the new group, and the test user couldn't read that file until I logged out and in again), which means the Accounts panel doesn't magically alter kernel state for running processes.

* Setting or unsetting _DARWIN_C_SOURCE doesn't affect the contents of pyconfig.h beyond that setting:

$ diff pyconfig.h-DARWIN_C_SOURCE pyconfig.h-NO_DARWIN_SOURCE 
1124c1124
< #define _DARWIN_C_SOURCE 1
---
> /* #undef _DARWIN_C_SOURCE */

"pyconfig.h-DARWIN_C_SOURCE" is generated by the current configure script; the other one is generated by a configure script that was patched to not set _DARWIN_C_SOURCE (by removing "AC_DEFINE(_DARWIN_C_SOURCE, 1, [Define on Darwin to activate all library features])" from configure.in and regenerating configure).  Both were generated using "configure MACOSX_DEPLOYMENT_TARGET=10.5".

* setgroups(3) cannot set more than 16 groups, that is "setgroups(17, gidset)" will always return EINVAL (this is on OSX 10.6.4). I've verified this using a C program that directly calls the right APIs.   
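
The two group lists being compared in this thread can be inspected side by
side from Python. A minimal sketch, assuming a Unix system and Python 3.3 or
later (where os.getgrouplist is available); the helper names process_groups
and database_groups are made up for illustration:

```python
import os
import pwd

def process_groups():
    """Group IDs reported by getgroups() for the current process."""
    return sorted(set(os.getgroups()))

def database_groups():
    """Group IDs the user database lists for the current uid.

    Returns None when the uid has no passwd entry (for example in a
    minimal container), since there is no user name to look up.
    """
    try:
        entry = pwd.getpwuid(os.getuid())
    except KeyError:
        return None
    return sorted(set(os.getgrouplist(entry.pw_name, entry.pw_gid)))

if __name__ == '__main__':
    print('getgroups():   ', process_groups())
    print('getgrouplist():', database_groups())
```

If getgroups is backed by a database query, as described for the
_DARWIN_C_SOURCE build, the two results would be expected to agree even
after a setgroups call; otherwise they can diverge.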

I'm busy with projects for the rest of the week and won't be able to do anything python-dev related until Sunday.

Ronald



From a.badger at gmail.com  Wed Jun 23 23:30:22 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 23 Jun 2010 17:30:22 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100623213645.658517d7@pitrou.net>
References: <20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<13837.1277311608@parc.com> <hvtjf5$4vs$1@dough.gmane.org>
	<20100623213645.658517d7@pitrou.net>
Message-ID: <20100623213022.GB3470@unaka.lan>

On Wed, Jun 23, 2010 at 09:36:45PM +0200, Antoine Pitrou wrote:
> On Wed, 23 Jun 2010 14:23:33 -0400
> Tres Seaver <tseaver at palladion.com> wrote:
> > - - the slow adoption / porting rate of major web frameworks and libraries
> >   to Python 3.
> 
> Some of the major web frameworks and libraries have a ton of
> dependencies, which would explain why they really haven't bothered yet.
> 
> I don't think you can claim, though, that Python 3 makes things
> significantly harder for these frameworks. The proof is that many of
> them already give the user unicode strings in Python 2.x. They must
> have somehow got the decoding right.
> 
Note that this assumption seems optimistic to me.  I started talking to Graham
Dumpleton, author of mod_wsgi a couple years back because mod_wsgi and paste
do decoding of bytes to unicode at different layers which caused problems
for application level code that should otherwise run fine when being served
by mod_wsgi or paste httpserver.  That was the beginning of Graham starting
to talk about what the wsgi spec really should look like under python3
instead of the broken way that the appendix to the current wsgi spec states.

-Toshio

From solipsis at pitrou.net  Wed Jun 23 23:35:12 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 23 Jun 2010 23:35:12 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100623213022.GB3470@unaka.lan>
References: <20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<13837.1277311608@parc.com> <hvtjf5$4vs$1@dough.gmane.org>
	<20100623213645.658517d7@pitrou.net>
	<20100623213022.GB3470@unaka.lan>
Message-ID: <20100623233512.50b5b710@pitrou.net>

On Wed, 23 Jun 2010 17:30:22 -0400
Toshio Kuratomi <a.badger at gmail.com> wrote:
> Note that this assumption seems optimistic to me.  I started talking to Graham
> Dumpleton, author of mod_wsgi a couple years back because mod_wsgi and paste
> do decoding of bytes to unicode at different layers which caused problems
> for application level code that should otherwise run fine when being served
> by mod_wsgi or paste httpserver.  That was the beginning of Graham starting
> to talk about what the wsgi spec really should look like under python3
> instead of the broken way that the appendix to the current wsgi spec states.

Ok, but the reason would be that the WSGI spec is broken. Not Python 3
itself.

Regards

Antoine.

From henry at precheur.org  Wed Jun 23 23:35:38 2010
From: henry at precheur.org (Henry Precheur)
Date: Wed, 23 Jun 2010 14:35:38 -0700
Subject: [Python-Dev] [Web-SIG] bytes / unicode
In-Reply-To: <20100623213645.658517d7@pitrou.net>
References: <20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<13837.1277311608@parc.com> <hvtjf5$4vs$1@dough.gmane.org>
	<20100623213645.658517d7@pitrou.net>
Message-ID: <20100623213538.GB9501@banane.novuscom.net>

On Wed, Jun 23, 2010 at 09:36:45PM +0200, Antoine Pitrou wrote:
> I don't think you can claim, though, that Python 3 makes things
> significantly harder for these frameworks. The proof is that many of
> them already give the user unicode strings in Python 2.x. They must
> have somehow got the decoding right.

Well... Frameworks usually 'simplify' the problem by partly ignoring it.
By default they assume the data in the request is UTF-8. You can specify
an alternative encoding in most of them. Django [1], Werkzeug [2], and
WebOb [3] do that.

The problem with this approach is that you still have to deal with weird
requests where one thing is unicode, and another is latin-1. Sometimes
you can even have 2 different encodings in a single header like Cookies.
There's no solution to this problem, it has to be solved on a case by
case basis.

There was a big discussion a while ago on web-sig. I think the consensus
was that WSGI for Python 3 should assume that the data is encoded in
latin-1 since it's the default encoding according to the RFC.
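
A minimal sketch of why latin-1 is a workable default, assuming the server
hands the application header bytes decoded as latin-1. The names
`wsgi_decode` and `recover` are hypothetical and not part of any WSGI
specification. Since latin-1 maps every byte to exactly one code point, the
decoding cannot fail, and the application can get the original bytes back
and re-decode them with whatever encoding it actually expects:

```python
def wsgi_decode(raw: bytes) -> str:
    """Decode raw header bytes as latin-1 (hypothetical server-side step)."""
    # latin-1 assigns a code point to each of the 256 byte values,
    # so this call never raises UnicodeDecodeError.
    return raw.decode("latin-1")


def recover(value: str, encoding: str = "utf-8") -> str:
    """Undo the latin-1 decoding, then decode with the real encoding."""
    return value.encode("latin-1").decode(encoding)


# "caf\u00e9" sent by a client as UTF-8 bytes.
raw = "caf\u00e9".encode("utf-8")
native = wsgi_decode(raw)

# The latin-1 view is mojibake, but no information has been lost.
print(repr(native))
print(repr(recover(native)))
```

The round trip is lossless only because the application knows (or guesses)
the real encoding; the sketch does not solve the mixed-encoding case
described above.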


[1] http://docs.djangoproject.com/en/dev/ref/request-response/#django.http.HttpRequest.encoding
[2] http://werkzeug.pocoo.org/documentation/dev/unicode.html#request-and-response-objects
[3] http://pythonpaste.org/webob/reference.html#unicode-variables

-- 
  Henry Precheur

From tullarisc256 at gmail.com  Wed Jun 23 21:08:52 2010
From: tullarisc256 at gmail.com (tullarisc)
Date: Wed, 23 Jun 2010 12:08:52 -0700 (PDT)
Subject: [Python-Dev]  swig/python and intel's threadedbuildginblocks
Message-ID: <28975580.post@talk.nabble.com>


Hi,

I've compiled Intel's OSS Threading Building Blocks library on OpenBSD
and wrapped everything in some SWIG interfaces.

Here you go: http://tullarisc.xtreemhost.com/swig.ttb.tgz

Love, tullarisc.
-- 
View this message in context: http://old.nabble.com/swig-python-and-intel%27s-threadedbuildginblocks-tp28975580p28975580.html
Sent from the Python - python-dev mailing list archive at Nabble.com.


From brett at python.org  Wed Jun 23 23:53:36 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 23 Jun 2010 14:53:36 -0700
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
Message-ID: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>

I finally realized why clang has not been silencing its warnings about
unused return values: I have -Wno-unused-value set in CFLAGS which
comes before OPT (which defines -Wall) as set in PY_CFLAGS in
Makefile.pre.in.

I could obviously set OPT in my environment, but that would override
the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
but the README says that's for stuff that tweaks binary compatibility.

So basically what I am asking is what environment variable should I
use? If CFLAGS is correct then does anyone have any issues if I change
the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
after OPT?

From a.badger at gmail.com  Thu Jun 24 00:57:40 2010
From: a.badger at gmail.com (Toshio Kuratomi)
Date: Wed, 23 Jun 2010 18:57:40 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100623233512.50b5b710@pitrou.net>
References: <AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<13837.1277311608@parc.com> <hvtjf5$4vs$1@dough.gmane.org>
	<20100623213645.658517d7@pitrou.net>
	<20100623213022.GB3470@unaka.lan>
	<20100623233512.50b5b710@pitrou.net>
Message-ID: <20100623225740.GC3470@unaka.lan>

On Wed, Jun 23, 2010 at 11:35:12PM +0200, Antoine Pitrou wrote:
> On Wed, 23 Jun 2010 17:30:22 -0400
> Toshio Kuratomi <a.badger at gmail.com> wrote:
> > Note that this assumption seems optimistic to me.  I started talking to Graham
> > Dumpleton, author of mod_wsgi a couple years back because mod_wsgi and paste
> > do decoding of bytes to unicode at different layers which caused problems
> > for application level code that should otherwise run fine when being served
> > by mod_wsgi or paste httpserver.  That was the beginning of Graham starting
> > to talk about what the wsgi spec really should look like under python3
> > instead of the broken way that the appendix to the current wsgi spec states.
> 
> Ok, but the reason would be that the WSGI spec is broken. Not Python 3
> itself.
> 
Agreed.  Neither python2 nor python3 is broken.  It's the wsgi spec and the
implementation of that spec where things fall down.  From your first post,
I thought you were claiming that python3 was broken since web frameworks got
decoding right on python2 and I just wanted to defend python3 by showing
that python2 wasn't all sunshine and roses.

-Toshio

From foom at fuhm.net  Thu Jun 24 02:26:25 2010
From: foom at fuhm.net (James Y Knight)
Date: Wed, 23 Jun 2010 20:26:25 -0400
Subject: [Python-Dev] Use of cgi.escape can lead to XSS vulnerabilities
In-Reply-To: <AANLkTil9vEOPLgenyPOu-F3OuC-R61ak_KoEvNARrGvW@mail.gmail.com>
References: <AANLkTil9vEOPLgenyPOu-F3OuC-R61ak_KoEvNARrGvW@mail.gmail.com>
Message-ID: <09E6BE78-066E-4BCF-AA34-C6286CF8AB98@fuhm.net>


On Jun 22, 2010, at 5:14 PM, Craig Younkins wrote:

> I suggest rewording the documentation for the method making it more  
> clear what it should and should not be used for. I would like to see  
> the method changed to properly escape single-quotes, but if it is  
> not changed, the documentation should explicitly say this method  
> does not make input safe for inclusion in HTML.

Well, it *does* make the input safe for inclusion in HTML...in a  
double-quoted attribute.

The docs could make it clearer that you should always use double- 
quotes around your attribute values when using it, though, I agree.
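
A short sketch of the point about quoting. `escape_like_cgi` is a
hypothetical stand-in that mirrors the documented behaviour of cgi.escape
(it replaces &, < and >, plus the double quote when quote is true); it is
not the stdlib function itself. It is compared with html.escape, which also
escapes the single quote:

```python
import html


def escape_like_cgi(s, quote=False):
    """Hypothetical re-implementation of the documented cgi.escape rules."""
    s = s.replace("&", "&amp;")  # must run first
    s = s.replace("<", "&lt;")
    s = s.replace(">", "&gt;")
    if quote:
        s = s.replace('"', "&quot;")
    return s


payload = "x' onmouseover='alert(1)"

# Safe: the attribute is double-quoted and double quotes are escaped.
in_double = '<a title="%s">' % escape_like_cgi('say "hi"', quote=True)

# Unsafe: the attribute is single-quoted and the single quote passes through.
in_single = "<a title='%s'>" % escape_like_cgi(payload, quote=True)

# html.escape with quote=True also escapes the single quote.
in_single_safe = "<a title='%s'>" % html.escape(payload, quote=True)

print(in_double)
print(in_single)
print(in_single_safe)
```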

From janssen at parc.com  Thu Jun 24 03:26:46 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 23 Jun 2010 18:26:46 PDT
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <02EFB202-505A-405E-AE00-ABC0A2234DDA@mac.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
	<02EFB202-505A-405E-AE00-ABC0A2234DDA@mac.com>
Message-ID: <1366.1277342806@parc.com>

See also http://gimper.net/viewtopic.php?f=18&t=3185.

Bill

From ronaldoussoren at mac.com  Thu Jun 24 08:10:42 2010
From: ronaldoussoren at mac.com (Ronald Oussoren)
Date: Thu, 24 Jun 2010 08:10:42 +0200
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <1366.1277342806@parc.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
	<02EFB202-505A-405E-AE00-ABC0A2234DDA@mac.com>
	<1366.1277342806@parc.com>
Message-ID: <F15CC65E-2605-4D97-BAEF-8C652CB9EE7D@mac.com>


On 24 Jun, 2010, at 3:26, Bill Janssen wrote:

> See also http://gimper.net/viewtopic.php?f=18&t=3185.

That's because setgroups(3) is limited to 16 groups (that is, the kernel doesn't support more than 16 groups at all).

Ronald


From greg.ewing at canterbury.ac.nz  Thu Jun 24 09:20:34 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Thu, 24 Jun 2010 19:20:34 +1200
Subject: [Python-Dev] os.getgroups() on MacOS X Was: red buildbots on 2.7
In-Reply-To: <F15CC65E-2605-4D97-BAEF-8C652CB9EE7D@mac.com>
References: <AANLkTili_NL9_Ob75cx0eH1LbcTHlfYKuMIG5lMDG_Z7@mail.gmail.com>
	<02EFB202-505A-405E-AE00-ABC0A2234DDA@mac.com>
	<1366.1277342806@parc.com>
	<F15CC65E-2605-4D97-BAEF-8C652CB9EE7D@mac.com>
Message-ID: <4C230742.40103@canterbury.ac.nz>

Ronald Oussoren wrote:

> That's because setgroups(3) is limited to 16 groups
> (that is, the kernel doesn't support more than 16 groups at all).

So how does an account being a member of 18 groups ever work?

-- 
Greg

From stephen at xemacs.org  Thu Jun 24 10:12:13 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Thu, 24 Jun 2010 17:12:13 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
Message-ID: <87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:

 > For example: how we can make the suite of functions used for URL
 > processing more polymorphic, so that each developer can choose for
 > herself how URLs need to be treated in her application.

While you have come down on the side of polymorphism (as opposed to
separate functions), I'm a little nervous about it.  Specifically,
Philip Eby expressed a desire for earlier type errors, while
polymorphism seems to ensure that you'll need to Look Before You Leap
to get early error detection.


From regebro at gmail.com  Thu Jun 24 11:05:03 2010
From: regebro at gmail.com (Lennart Regebro)
Date: Thu, 24 Jun 2010 11:05:03 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
Message-ID: <AANLkTin2I-DMfS9q3SBPuY_URinhuP4umYDYq-D3qjiQ@mail.gmail.com>

On Tue, Jun 22, 2010 at 20:07, James Y Knight <foom at fuhm.net> wrote:
> Yeah. This is a real issue I have with the direction Python3 went: it pushes
> you into decoding everything to unicode early, even when you don't care --

Well, yes, maybe even if *you* don't care. But often the functions you
need to call must care, and then you need to decode to unicode, even
if you personally don't care. And in those cases, you should decode as
early as possible.

In the cases where neither you nor the functions you call care, then
you don't have to decode, and you can happily pass binary data from
one function to another.

So this is not really a question of the direction Python 3 went. It's
more a case that some methods that *could* do their transformations in
a well defined way on bytes don't, and then force you to decode to
unicode. But that's not a problem with direction, it's just a missing
feature in the stdlib.

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python3porting.com/
+33 661 58 14 64

From mal at egenix.com  Thu Jun 24 12:58:23 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Thu, 24 Jun 2010 12:58:23 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTin2I-DMfS9q3SBPuY_URinhuP4umYDYq-D3qjiQ@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<AANLkTin2I-DMfS9q3SBPuY_URinhuP4umYDYq-D3qjiQ@mail.gmail.com>
Message-ID: <4C233A4F.2030607@egenix.com>

Lennart Regebro wrote:
> On Tue, Jun 22, 2010 at 20:07, James Y Knight <foom at fuhm.net> wrote:
>> Yeah. This is a real issue I have with the direction Python3 went: it pushes
>> you into decoding everything to unicode early, even when you don't care --
> 
> Well, yes, maybe even if *you* don't care. But often the functions you
> need to call must care, and then you need to decode to unicode, even
> if you personally don't care. And in those cases, you should decode as
> early as possible.
> 
> In the cases where neither you nor the functions you call care, then
> you don't have to decode, and you can happily pass binary data from
> one function to another.
> 
> So this is not really a question of the direction Python 3 went. It's
> more a case that some methods that *could* do their transformations in
> a well defined way on bytes don't, and then force you to decode to
> unicode. But that's not a problem with direction, it's just a missing
> feature in the stdlib.

The discussion is showing that in at least a few application spaces,
the stdlib should be able to work on both bytes and Unicode, preferably
through the same interfaces via polymorphism, i.e.

some_function(bytes) -> bytes
some_function(str) -> str

In Python2 this partially works due to the automatic bytes->str
conversion (in some cases you get some_function(bytes) -> str),
the codec base class implementations being a prime example.

In Python3, things have to be done explicitly, and I think we need
to add a few helpers to make writing such str/bytes interfaces
easier.
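
One possible shape for such a helper, as a rough sketch only. Both
`_literal` and `strip_fragment` are hypothetical names invented for this
illustration; nothing like them is proposed in the thread. The idea is
that a function picks its literals to match the type of its argument, so
the same body serves str and bytes:

```python
def _literal(text, like):
    """Return the ASCII literal `text` as the same type as `like`."""
    if isinstance(like, str):
        return text
    if isinstance(like, (bytes, bytearray)):
        return text.encode("ascii")
    raise TypeError("expected str or bytes, got %s" % type(like).__name__)


def strip_fragment(url):
    """some_function(bytes) -> bytes, some_function(str) -> str."""
    sep = _literal("#", url)
    # partition() exists on both str and bytes and returns the input type.
    return url.partition(sep)[0]


print(strip_fragment("http://example.com/a#top"))
print(strip_fragment(b"http://example.com/a#top"))
```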

We've already had some suggestions in that area, but probably need
to collect a few more ideas based on real-life porting attempts.

I'd like to make this a topic at the upcoming language summit
in Birmingham, if Michael agrees.

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 24 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                24 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From fuzzyman at voidspace.org.uk  Thu Jun 24 13:00:12 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 24 Jun 2010 12:00:12 +0100
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C233A4F.2030607@egenix.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100621165611.GW5787@unaka.lan>	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>	<20100622055040.GE5787@unaka.lan>	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>	<AANLkTin2I-DMfS9q3SBPuY_URinhuP4umYDYq-D3qjiQ@mail.gmail.com>
	<4C233A4F.2030607@egenix.com>
Message-ID: <4C233ABC.40702@voidspace.org.uk>

On 24/06/2010 11:58, M.-A. Lemburg wrote:
> Lennart Regebro wrote:
>    
>> On Tue, Jun 22, 2010 at 20:07, James Y Knight <foom at fuhm.net> wrote:
>>      
>>> Yeah. This is a real issue I have with the direction Python3 went: it pushes
>>> you into decoding everything to unicode early, even when you don't care --
>>>        
>> Well, yes, maybe even if *you* don't care. But often the functions you
>> need to call must care, and then you need to decode to unicode, even
>> if you personally don't care. And in those cases, you should decode as
>> early as possible.
>>
>> In the cases where neither you nor the functions you call care, then
>> you don't have to decode, and you can happily pass binary data from
>> one function to another.
>>
>> So this is not really a question of the direction Python 3 went. It's
>> more a case that some methods that *could* do their transformations in
>> a well defined way on bytes don't, and then force you to decode to
>> unicode. But that's not a problem with direction, it's just a missing
>> feature in the stdlib.
>>      
> The discussion is showing that in at least a few application spaces,
> the stdlib should be able to work on both bytes and Unicode, preferably
> using the same interfaces using polymorphism, i.e.
>
> some_function(bytes) ->  bytes
> some_function(str) ->  str
>
> In Python2 this partially works due to the automatic bytes->str
> conversion (in some cases you get some_function(bytes) ->  str),
> the codec base class implementations being a prime example.
>
> In Python3, things have to be done explicitly and I think we need
> to add a few helpers to make writing such str/bytes interfaces
> easier.
>
> We've already had some suggestions in that area, but probably need
> to collect a few more ideas based on real-life porting attempts.
>
> I'd like to make this a topic at the upcoming language summit
> in Birmingham, if Michael agrees.
>
>    
Yep, it sounds like a great topic for the language summit.

Michael

-- 
http://www.ironpythoninaction.com/


From guido at python.org  Thu Jun 24 16:33:42 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 Jun 2010 07:33:42 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com> 
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>

On Thu, Jun 24, 2010 at 1:12 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:
> Guido van Rossum writes:
>
>  > For example: how we can make the suite of functions used for URL
>  > processing more polymorphic, so that each developer can choose for
>  > herself how URLs need to be treated in her application.
>
> While you have come down on the side of polymorphism (as opposed to
> separate functions), I'm a little nervous about it.  Specifically,
> Philip Eby expressed a desire for earlier type errors, while
> polymorphism seems to ensure that you'll need to Look Before You Leap
> to get early error detection.

Understood, but both the majority of str/bytes methods and several
existing APIs (e.g. many in the os module, like os.listdir()) do it
this way.

Also, IMO a polymorphic function should *not* accept *mixed*
bytes/text input -- join('x', b'y') should be rejected. But join('x',
'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.
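
A sketch of that rule, assuming a two-argument `join` that simply puts a
slash between its arguments (a simplified illustration, not the stdlib's
os.path.join):

```python
def join(a, b):
    """Join two path parts; all-str or all-bytes, never a mixture."""
    if isinstance(a, str) and isinstance(b, str):
        return a + "/" + b
    if isinstance(a, bytes) and isinstance(b, bytes):
        return a + b"/" + b
    # Mixed input is rejected up front instead of being promoted.
    raise TypeError(
        "join() arguments must be all str or all bytes, not %s and %s"
        % (type(a).__name__, type(b).__name__)
    )


print(join("x", "y"))
print(join(b"x", b"y"))
try:
    join("x", b"y")
except TypeError as exc:
    print("rejected:", exc)
```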

So, actually, I *don't* understand what you mean by needing LBYL.

-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Thu Jun 24 17:25:18 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 Jun 2010 01:25:18 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>
Message-ID: <AANLkTimztrpXVK2rK9r8qhAGU8Epyh_M0sZlzx6Jb_St@mail.gmail.com>

On Fri, Jun 25, 2010 at 12:33 AM, Guido van Rossum <guido at python.org> wrote:
> Also, IMO a polymorphic function should *not* accept *mixed*
> bytes/text input -- join('x', b'y') should be rejected. But join('x',
> 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.

A policy of allowing arguments to be either str or bytes, but not a
mixture, actually avoids one of the more painful aspects of the 2.x
"promote mixed operations to unicode" approach. Specifically, you
either had to scan all the arguments up front to check for unicode, or
else you had to stop what you were doing and start again with the
unicode version if you encountered unicode partway through. Neither
was particularly nice to implement.

As you noted elsewhere, literals and string methods are still likely
to be a major sticking point with that approach - common operations
like ''.join(seq) and b''.join(seq) aren't polymorphic, so functions
that use them won't be polymorphic either. (It's only the str->unicode
promotion behaviour in 2.x that works around this problem there).

Would it be heretical to suggest that sum() be allowed to work on
strings to at least eliminate ''.join() as something that breaks bytes
processing? It already works for bytes, although it then fails with a
confusing message for bytearray:

>>> sum(b"a b c".split(), b'')
b'abc'

>>> sum(bytearray(b"a b c").split(), bytearray(b''))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum bytes [use b''.join(seq) instead]

>>> sum("a b c".split(), '')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sum() can't sum strings [use ''.join(seq) instead]

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Thu Jun 24 17:41:14 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 Jun 2010 08:41:14 -0700
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTimztrpXVK2rK9r8qhAGU8Epyh_M0sZlzx6Jb_St@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com> 
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com> 
	<AANLkTimztrpXVK2rK9r8qhAGU8Epyh_M0sZlzx6Jb_St@mail.gmail.com>
Message-ID: <AANLkTinR08fCedIzZfG4It4FrTHF15S_m_wY-j1i5NmG@mail.gmail.com>

On Thu, Jun 24, 2010 at 8:25 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Fri, Jun 25, 2010 at 12:33 AM, Guido van Rossum <guido at python.org> wrote:
>> Also, IMO a polymorphic function should *not* accept *mixed*
>> bytes/text input -- join('x', b'y') should be rejected. But join('x',
>> 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make sense to me.
>
> A policy of allowing arguments to be either str or bytes, but not a
> mixture, actually avoids one of the more painful aspects of the 2.x
> "promote mixed operations to unicode" approach. Specifically, you
> either had to scan all the arguments up front to check for unicode, or
> else you had to stop what you were doing and start again with the
> unicode version if you encountered unicode partway through. Neither
> was particularly nice to implement.

Right. Polymorphic functions should *not* allow mixing text and bytes.
It's all text or all bytes.

> As you noted elsewhere, literals and string methods are still likely
> to be a major sticking point with that approach - common operations
> like ''.join(seq) and b''.join(seq) aren't polymorphic, so functions
> that use them won't be polymorphic either. (It's only the str->unicode
> promotion behaviour in 2.x that works around this problem there).
>
> Would it be heretical to suggest that sum() be allowed to work on
> strings to at least eliminate ''.join() as something that breaks bytes
> processing? It already works for bytes, although it then fails with a
> confusing message for bytearray:
>
>>>> sum(b"a b c".split(), b'')
> b'abc'
>
>>>> sum(bytearray(b"a b c").split(), bytearray(b''))
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sum() can't sum bytes [use b''.join(seq) instead]
>
>>>> sum("a b c".split(), '')
> Traceback (most recent call last):
>   File "<stdin>", line 1, in <module>
> TypeError: sum() can't sum strings [use ''.join(seq) instead]

I don't think we should abuse sum for this. A simple idiom to get the
*empty* string of a particular type is x[:0], so you could write
something like this to concatenate a list of strings or bytes:
xs[0][:0].join(xs). Note that if xs is empty we wouldn't know what to do
anyway, so this should be disallowed.
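
The empty-value idiom can be sketched as follows. `concat` is a
hypothetical helper name; it takes the empty value from the first element
(a list itself has no join method) and refuses an empty sequence, since
there is nothing to infer the type from:

```python
def concat(xs):
    """Concatenate a non-empty list of str, bytes or bytearray items."""
    if not xs:
        raise ValueError("cannot infer str or bytes from an empty sequence")
    empty = xs[0][:0]  # '' for str, b'' for bytes, bytearray(b'') for bytearray
    return empty.join(xs)


print(concat("a b c".split()))
print(concat(b"a b c".split()))
print(concat(bytearray(b"a b c").split()))
```

Mixed input still fails, because str.join and bytes.join each reject items
of the other type.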

-- 
--Guido van Rossum (python.org/~guido)

From barry at python.org  Thu Jun 24 17:50:48 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 11:50:48 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
Message-ID: <20100624115048.4fd152e3@heresy>

This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
allows for Python source files from different Python versions to live together
in the same directory.  It does this by putting a magic tag in the .pyc file
name and placing the .pyc file in a __pycache__ directory.

Distros such as Debian and Ubuntu will use this to greatly simplify
deploying Python, and Python applications and libraries.  Debian and Ubuntu
usually ship more than one version of Python, and currently have to play
complex games with symlinks to make this work.  PEP 3147 will go a long way to
eliminating the need for extra directories and symlinks.

One more thing I've found we need though, is a way to handle shared libraries
for extension modules.  Just as we can get name collisions on foo.pyc, we can
get collisions on foo.so.  We obviously cannot install foo.so built for Python
3.2 and foo.so built for Python 3.3 in the same location.  So symlink
nightmare's mini-me is back.

I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
been discussed before, but teh Googles hasn't turned up anything.

The idea is to put the Python version number in the shared library file name,
and extend .so lookup to find these extended file names.  So for example, we'd
see foo.3.2.so instead, and Python would know how to dynload both that and the
traditional foo.so file too (for backward compatibility).

(On file naming: the original patch used foo.so.3.2 and that works just as
well, but I thought there might be tools that expect exactly a '.so' suffix,
so I changed it to put the Major.Minor version number to the left of the
extension.  The exact naming scheme is of course open to debate.)
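
To make the lookup concrete, here is a small sketch of the file names such
a scheme would try for a module. `candidate_names` is a hypothetical
function, and trying the versioned name before the traditional one is an
assumption of this sketch, not something the patch is stated to do:

```python
def candidate_names(module, version=(3, 2)):
    """Return shared library file names to try for `module`, in order."""
    major, minor = version
    return [
        "%s.%d.%d.so" % (module, major, minor),  # versioned name
        "%s.so" % module,                        # traditional fallback
    ]


print(candidate_names("foo"))
print(candidate_names("foo", version=(3, 3)))
```

With names like these, builds of foo for 3.2 and 3.3 can sit in the same
directory without colliding.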

This is a much simpler patch than PEP 3147, though I'm not 100% sure it's the
right approach.  The way this works is by modifying the configure and
Makefile.pre.in to put the version number in the $SO make variable.  Python
parses its (generated) Makefile to find $SO and it uses this deep in the
bowels of distutils to decide what suffix to use when writing shared libraries
built by 'python setup.py build_ext'.
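
For readers who want to see what suffix an interpreter uses, the value can
be inspected at run time. This sketch assumes a current Python 3 release,
where the configuration variable is named EXT_SUFFIX rather than the SO
variable described above; the printed values depend on platform and build:

```python
import importlib.machinery
import sysconfig

# Suffix used when building extension modules, or None if it is undefined.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")

# Every file name suffix the import system accepts for extension modules.
accepted = importlib.machinery.EXTENSION_SUFFIXES

print("build suffix:", ext_suffix)
print("accepted suffixes:", accepted)
```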

This means the patched Python only writes versioned .so files by default.  I
personally don't see that as a problem, and it does not affect the test suite,
with the exception of one easily tweaked test.  I don't know if third party
tools will care.  The fact that traditional foo.so shared libraries will still
satisfy the import should be enough, I think.

The patch is currently Linux only, since I need this for Debian and Ubuntu and
wanted to keep the change narrow.

Other possible approaches:
 * Extend the distutils API so that the .so file extension can be passed in,
   instead of being essentially hardcoded to what Python's Makefile contains.
 * Keep the dynload_shlib.c change, but modify the Debian/Ubuntu build
   environment to pass in $SO to make (though the configure.in warning and
   sleep is a little annoying).
 * Add a ./configure option to enable this, which Debuntu's build would use.

The patch is available here:

    http://pastebin.ubuntu.com/454512/

and my working branch is here:

    https://code.edge.launchpad.net/~barry/python/sovers

Please let me know what you think.  I'm happy to just commit this to the py3k
branch if there are no objections <wink>.  I don't think a new PEP is in
order, but an update to PEP 3147 might make sense.

Cheers,
-Barry

From benjamin at python.org  Thu Jun 24 17:58:09 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 24 Jun 2010 10:58:09 -0500
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624115048.4fd152e3@heresy>
References: <20100624115048.4fd152e3@heresy>
Message-ID: <AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>

2010/6/24 Barry Warsaw <barry at python.org>:
> Please let me know what you think.  I'm happy to just commit this to the py3k
> branch if there are no objections <wink>.  I don't think a new PEP is in
> order, but an update to PEP 3147 might make sense.

How will this interact with PEP 384 if that is implemented?


-- 
Regards,
Benjamin

From daniel at stutzbachenterprises.com  Thu Jun 24 18:05:29 2010
From: daniel at stutzbachenterprises.com (Daniel Stutzbach)
Date: Thu, 24 Jun 2010 11:05:29 -0500
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624115048.4fd152e3@heresy>
References: <20100624115048.4fd152e3@heresy>
Message-ID: <AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>

On Thu, Jun 24, 2010 at 10:50 AM, Barry Warsaw <barry at python.org> wrote:

> The idea is to put the Python version number in the shared library file name,
> and extend .so lookup to find these extended file names.  So for example, we'd
> see foo.3.2.so instead, and Python would know how to dynload both that and the
> traditional foo.so file too (for backward compatibility).
>

What use case does this address?

PEP 3147 addresses the fact that the user may have different versions of
Python installed and each wants to write a .pyc file when loading a module.
.so files are not generated simply by running the Python interpreter, ergo
.so files are not an issue for that use case.

If you want to make it so a system can install a package in just one
location to be used by multiple Python installations, then the version
number isn't enough.  You also need to distinguish debug builds, profiling
builds, Unicode width (see issue8654), and probably several other
./configure options.
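
To make that concrete, here is a minimal sketch of a tag that folds build
options into the file name next to the version number.  Everything in it is an
assumption for illustration only: the names build_tag and tagged_name, the
flag letters "d" and "u", and the resulting file names are hypothetical and
are not part of the posted patch.

```python
def build_tag(version, debug=False, wide_unicode=False):
    """Return a hypothetical tag such as '3.2' or '3.2du' for one build."""
    major, minor = version
    flags = ""
    if debug:
        flags += "d"   # assumed marker for a debug build
    if wide_unicode:
        flags += "u"   # assumed marker for a wide-Unicode build
    return "%d.%d%s" % (major, minor, flags)


def tagged_name(module, version, **build_options):
    """File name an extension would get under this made-up scheme."""
    return "%s.%s.so" % (module, build_tag(version, **build_options))
```

With something like this, a debug build and a regular build of the same
version would no longer collide on one file name; profiling builds and other
./configure options would each need their own marker as well.
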
--
Daniel Stutzbach, Ph.D.
President, Stutzbach Enterprises, LLC <http://stutzbachenterprises.com>

From baptiste13z at free.fr  Thu Jun 24 18:58:59 2010
From: baptiste13z at free.fr (Baptiste Carvello)
Date: Thu, 24 Jun 2010 18:58:59 +0200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100621181750.267933A404D@sparrow.telecommunity.com>
References: <h3sa87mevl05p5ro18062010012216@SMTP>	<87sk4jcejy.fsf@uwakimon.sk.tsukuba.ac.jp>	<201006201204.30795.steve@pearwood.info>	<AANLkTin3W3WWH7bfPo9QdlrIMRJOdfKYzFhl2YiPNHHF@mail.gmail.com>	<AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>	<20100620234723.600ad4a8@pitrou.net>	<20100621023005.EE17E3A4099@sparrow.telecommunity.com>	<AANLkTim5XUVC6Hz08Xfh7AX5Tv0CeVfFOaBmxPZuHJRw@mail.gmail.com>	<20100621164650.16A093A414B@sparrow.telecommunity.com>	<AANLkTikV3wkEUsr68cov8CIVWy4BZebeQ1O5PVBdI2zM@mail.gmail.com>
	<20100621181750.267933A404D@sparrow.telecommunity.com>
Message-ID: <i002sk$a2f$1@dough.gmane.org>

P.J. Eby wrote:

> [...] stdlib constants are almost always ASCII, 
> and the main use cases for ebytes would involve ascii-extended encodings.)

Then, how about a new "ascii string" literal? This would produce a special kind 
of string that would coerce to a normal string when mixed with a str, and to a 
bytes using ascii codec when mixed with a bytes. Then you could write

 >>> a"/".join([base, path])

and not worry if base and path are both str, or both bytes (mixed being of 
course forbidden).
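
Since the a"..." literal does not exist, the behaviour can only be
approximated today with an ordinary class.  The following is a rough sketch
under that assumption; the class name AsciiStr is hypothetical, and only join()
is covered.

```python
class AsciiStr(str):
    """Hypothetical stand-in for the proposed a"..." literal."""

    def __new__(cls, text):
        text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII text
        return super().__new__(cls, text)

    def join(self, parts):
        parts = list(parts)
        if all(isinstance(p, bytes) for p in parts):
            # All bytes: coerce the separator with the ascii codec.
            return self.encode("ascii").join(parts)
        if all(isinstance(p, str) for p in parts):
            # All str: behave like a normal string.
            return str(self).join(parts)
        raise TypeError("cannot mix str and bytes")
```

Here AsciiStr("/").join([base, path]) gives a str when both arguments are str
and a bytes when both are bytes, and refuses the mixed case.
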

B.


From pje at telecommunity.com  Thu Jun 24 19:07:01 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Thu, 24 Jun 2010 13:07:01 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100624170856.0853D3A4099@sparrow.telecommunity.com>

At 05:12 PM 6/24/2010 +0900, Stephen J. Turnbull wrote:
>Guido van Rossum writes:
>
>  > For example: how we can make the suite of functions used for URL
>  > processing more polymorphic, so that each developer can choose for
>  > herself how URLs need to be treated in her application.
>
>While you have come down on the side of polymorphism (as opposed to
>separate functions), I'm a little nervous about it.  Specifically,
>Phillip Eby expressed a desire for earlier type errors, while
>polymorphism seems to ensure that you'll need to Look Before You Leap
>to get early error detection.

This doesn't have to be in the functions; it can be in the 
*types*.  Mixed-type string operations have to do type checking and 
upcasting already, but if the protocol were open, you could make an 
encoded-bytes type that would handle the error checking.

(Btw, in some earlier emails, Stephen, you implied that this could be 
fixed with codecs -- but it can't, because the problem isn't with the 
bytes containing invalid Unicode, it's with the Unicode containing 
invalid bytes -- i.e., characters that can't be encoded to the 
ultimate codec target.)
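
A small illustration of that asymmetry, using only the built-in codecs.  The
helper name encodable is made up for this sketch.

```python
def encodable(text, codec):
    """Return True if every character of text fits the target codec."""
    try:
        text.encode(codec)
    except UnicodeEncodeError:
        return False
    return True


# Decoding side: every possible byte value decodes under latin-1.
all_bytes_as_text = bytes(range(256)).decode("latin-1")

# Encoding side: whether text can be encoded depends on the final codec.
cafe = "caf\u00e9"
fits_latin1 = encodable(cafe, "latin-1")
fits_ascii = encodable(cafe, "ascii")
```

The same text is fine for one target and an error for another, and the error
only shows up at the point where the final encoding is applied.
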


From janssen at parc.com  Thu Jun 24 19:38:19 2010
From: janssen at parc.com (Bill Janssen)
Date: Thu, 24 Jun 2010 10:38:19 PDT
Subject: [Python-Dev] thoughts on the bytes/string discussion
Message-ID: <11597.1277401099@parc.com>

Here are a couple of ideas I'm taking away from the bytes/string
discussion.

First, it would probably be a good idea to have a String ABC.

Secondly, maybe the string situation in 2.x wasn't as broken as we
thought it was.  In particular, those who deal with lots of encoded
strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
are more like numbers than we think.  We have separate types for int,
float, Decimal, etc.  But they're all numbers, and they all
cross-operate.  In 2.x, it seems there were two missing features: no
encoding attribute on str, which should have been there and should have
been required, and the default encoding being "ASCII" (I can't tell you
how many times I've had to fix that issue when a non-ASCII encoded str
was passed to some output function).

So maybe having a second string type in 3.x that consists of an encoded
sequence of bytes plus the encoding, call it "estr", wouldn't have been
a bad idea.  It would probably have made sense to have estr cooperate
with the str type, in the same way that two different kinds of numbers
cooperate, "promoting" the result of an operation only when necessary.
This would automatically achieve the kind of polymorphic functionality
that Guido is suggesting, but without losing the ability to do

  x = e(ASCII)"bar"
  a = ''.join(["foo", x])

(or whatever the syntax for such an encoded string literal would be --
I'm not claiming this is a good one) which I presume would bind "a" to a
Unicode string "foobar" -- have to work out what gets promoted to what.
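
A very rough sketch of how such promotion might look as a plain class.  The
name EStr and its rules are assumptions for illustration, not a worked-out
design, and it uses "+" rather than join() because str.join() only accepts
real str instances.

```python
class EStr:
    """Hypothetical 'estr': encoded bytes that remember their encoding."""

    def __init__(self, data, encoding):
        self.data = bytes(data)
        self.encoding = encoding

    def __str__(self):
        return self.data.decode(self.encoding)

    def __add__(self, other):
        if isinstance(other, EStr) and other.encoding == self.encoding:
            # Same encoding on both sides: no promotion needed.
            return EStr(self.data + other.data, self.encoding)
        if isinstance(other, (str, EStr)):
            # Otherwise promote the result to a Unicode str.
            return str(self) + str(other)
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, str):
            return other + str(self)
        return NotImplemented
```

Under these assumed rules, "foo" + EStr(b"bar", "ascii") gives the Unicode
string "foobar", while adding two EStr values with the same encoding stays an
EStr.
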

The language moratorium kind of makes this all theoretical, but building
a String ABC still would be a good start, and presumably isn't forbidden
by the moratorium.

Bill


From brett at python.org  Thu Jun 24 19:48:56 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 24 Jun 2010 10:48:56 -0700
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624115048.4fd152e3@heresy>
References: <20100624115048.4fd152e3@heresy>
Message-ID: <AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>

On Thu, Jun 24, 2010 at 08:50, Barry Warsaw <barry at python.org> wrote:
> This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
> allows for Python source files from different Python versions to live together
> in the same directory.  It does this by putting a magic tag in the .pyc file
> name and placing the .pyc file in a __pycache__ directory.
>
> Distros such as Debian and Ubuntu will use this to greatly simplify
> deploying Python, and Python applications and libraries.  Debian and Ubuntu
> usually ship more than one version of Python, and currently have to play
> complex games with symlinks to make this work.  PEP 3147 will go a long way to
> eliminating the need for extra directories and symlinks.
>
> One more thing I've found we need though, is a way to handle shared libraries
> for extension modules.  Just as we can get name collisions on foo.pyc, we can
> get collisions on foo.so.  We obviously cannot install foo.so built for Python
> 3.2 and foo.so built for Python 3.3 in the same location.  So symlink
> nightmare's mini-me is back.
>
> I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
> been discussed before, but teh Googles hasn't turned up anything.
>
> The idea is to put the Python version number in the shared library file name,
> and extend .so lookup to find these extended file names.  So for example, we'd
> see foo.3.2.so instead, and Python would know how to dynload both that and the
> traditional foo.so file too (for backward compatibility).
>
> (On file naming: the original patch used foo.so.3.2 and that works just as
> well, but I thought there might be tools that expect exactly a '.so' suffix,
> so I changed it to put the Major.Minor version number to the left of the
> extension.  The exact naming scheme is of course open to debate.)
>
>

While the idea is fine with me since I won't have any of my
directories cluttered with multiple .so files, I would still want to
add some moniker showing that the version number represents the
interpreter and not the .so file. If I read "foo.3.2.so", that naively
seems to mean the foo module's 3.2 release is what is installed, not
that it's built for CPython 3.2. So even though it
might be redundant, I would still want the VM name added.

Adding the VM name also means extension modules are not the exclusive
domain of CPython. If some other VM decides to make its own .so files
that are not binary compatible, then we should not preclude that, since
this solution costs nothing more than making a string comparison look
at 7 more characters.
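
As a sketch of what that comparison could look like (the suffix layout
".cpython-3.2.so" and the function names are assumptions of mine, not a
settled naming scheme):

```python
VM_NAME = "cpython"  # 7 characters, the extra comparison cost mentioned above


def tagged_suffix(version, vm=VM_NAME):
    """Build a hypothetical suffix such as '.cpython-3.2.so'."""
    return ".%s-%d.%d.so" % (vm, version[0], version[1])


def is_loadable(filename, version, vm=VM_NAME):
    """True if filename carries this interpreter's tag or no tag at all."""
    if filename.endswith(tagged_suffix(version, vm)):
        return True
    # An untagged module looks like 'foo.so': a single dot, then 'so'.
    return filename.endswith(".so") and filename.count(".") == 1
```

A file built for another VM, or for another version, simply fails the suffix
check and is skipped.
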

-Brett

P.S.: I wish we could drop use of the 'module.so' variant at the same
time, for consistency sake and to cut out a stat call, but I know that
is asking too much.

From barry at python.org  Thu Jun 24 19:51:19 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 13:51:19 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>
Message-ID: <20100624135119.00b9ac5c@heresy>

On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:

>2010/6/24 Barry Warsaw <barry at python.org>:
>> Please let me know what you think.  I'm happy to just commit this to the
>> py3k branch if there are no objections <wink>.  I don't think a new PEP is
>> in order, but an update to PEP 3147 might make sense.
>
>How will this interact with PEP 384 if that is implemented?

Good question, I'd forgotten to mention that PEP.

I think the PEP is a good idea, and worth working on, but it is a longer term
solution to the problem of extension source code compatibility.  It's longer
term because extensions will have to be rewritten to use the new API defined
in PEP 384.  It will take a long time to get this into practice, and
supporting it will happen on a case-by-case basis.

I'm trying to come up with something that will work immediately while PEP 384
is being adopted.

-Barry

From benjamin at python.org  Thu Jun 24 20:00:54 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Thu, 24 Jun 2010 13:00:54 -0500
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624135119.00b9ac5c@heresy>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>
	<20100624135119.00b9ac5c@heresy>
Message-ID: <AANLkTindH5uADbSwan-xWV08YcDaEKI3CleaFjhdmHvX@mail.gmail.com>

2010/6/24 Barry Warsaw <barry at python.org>:
> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>
>>2010/6/24 Barry Warsaw <barry at python.org>:
>>> Please let me know what you think.  I'm happy to just commit this to the
>>> py3k branch if there are no objections <wink>.  I don't think a new PEP is
>>> in order, but an update to PEP 3147 might make sense.
>>
>>How will this interact with PEP 384 if that is implemented?
> I'm trying to come up with something that will work immediately while PEP 384
> is being adopted.

But how will modules specify that they support multiple ABIs then?


-- 
Regards,
Benjamin

From brett at python.org  Thu Jun 24 20:11:07 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 24 Jun 2010 11:11:07 -0700
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <11597.1277401099@parc.com>
References: <11597.1277401099@parc.com>
Message-ID: <AANLkTil09NfdbqdhP1c1vJ08HNpbLqQIgs9iyoIxQsKP@mail.gmail.com>

On Thu, Jun 24, 2010 at 10:38, Bill Janssen <janssen at parc.com> wrote:
[SNIP]
> The language moratorium kind of makes this all theoretical, but building
> a String ABC still would be a good start, and presumably isn't forbidden
> by the moratorium.

Because a new ABC would go into the stdlib (I assume in collections or
string) the moratorium does not apply.

From guido at python.org  Thu Jun 24 20:27:37 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 Jun 2010 11:27:37 -0700
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
Message-ID: <AANLkTinIv28woxfcxRCi_tUNyn2VVuHwuzJFJ4OgKXFJ@mail.gmail.com>

On Thu, Jun 24, 2010 at 10:48 AM, Brett Cannon <brett at python.org> wrote:
> On Thu, Jun 24, 2010 at 08:50, Barry Warsaw <barry at python.org> wrote:
>> This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
>> allows for Python source files from different Python versions to live together
>> in the same directory.  It does this by putting a magic tag in the .pyc file
>> name and placing the .pyc file in a __pycache__ directory.
>>
>> Distros such as Debian and Ubuntu will use this to greatly simplify
>> deploying Python, and Python applications and libraries.  Debian and Ubuntu
>> usually ship more than one version of Python, and currently have to play
>> complex games with symlinks to make this work.  PEP 3147 will go a long way to
>> eliminating the need for extra directories and symlinks.
>>
>> One more thing I've found we need though, is a way to handle shared libraries
>> for extension modules.  Just as we can get name collisions on foo.pyc, we can
>> get collisions on foo.so.  We obviously cannot install foo.so built for Python
>> 3.2 and foo.so built for Python 3.3 in the same location.  So symlink
>> nightmare's mini-me is back.
>>
>> I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
>> been discussed before, but teh Googles hasn't turned up anything.
>>
>> The idea is to put the Python version number in the shared library file name,
>> and extend .so lookup to find these extended file names.  So for example, we'd
>> see foo.3.2.so instead, and Python would know how to dynload both that and the
>> traditional foo.so file too (for backward compatibility).
>>
>> (On file naming: the original patch used foo.so.3.2 and that works just as
>> well, but I thought there might be tools that expect exactly a '.so' suffix,
>> so I changed it to put the Major.Minor version number to the left of the
>> extension.  The exact naming scheme is of course open to debate.)
>>
>>
>
> While the idea is fine with me since I won't have any of my
> directories cluttered with multiple .so files, I would still want to
> add some moniker showing that the version number represents the
> interpreter and not the .so file. If I read "foo.3.2.so", that naively
> seems to mean the foo module's 3.2 release is what is installed, not
> that it's built for CPython 3.2. So even though it
> might be redundant, I would still want the VM name added.

Well, for versions of the .so itself, traditionally version numbers
are appended *after* the .so suffix (check your /lib directory :-).

> Adding the VM name also means extension modules are not the exclusive
> domain of CPython. If some other VM decides to make its own .so files
> that are not binary compatible, then we should not preclude that, since
> this solution costs nothing more than making a string comparison look
> at 7 more characters.
>
> -Brett
>
> P.S.: I wish we could drop use of the 'module.so' variant at the same
> time, for consistency sake and to cut out a stat call, but I know that
> is asking too much.

I wish so too. IIRC there used to be some modules that on Windows were
wrappers around 3rd party DLLs and you can't have foo.dll as the
module wrapping foo.dll the 3rd party DLL. (On Unix this problem
doesn't exist because the 3rd party .so would be named libfoo.so, not
foo.so.)

-- 
--Guido van Rossum (python.org/~guido)

From barry at python.org  Thu Jun 24 20:28:30 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 14:28:30 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTindH5uADbSwan-xWV08YcDaEKI3CleaFjhdmHvX@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>
	<20100624135119.00b9ac5c@heresy>
	<AANLkTindH5uADbSwan-xWV08YcDaEKI3CleaFjhdmHvX@mail.gmail.com>
Message-ID: <20100624142830.4c859faf@limelight.wooz.org>

On Jun 24, 2010, at 01:00 PM, Benjamin Peterson wrote:

>2010/6/24 Barry Warsaw <barry at python.org>:
>> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>>
>>>2010/6/24 Barry Warsaw <barry at python.org>:
>>>> Please let me know what you think.  I'm happy to just commit this to the
>>>> py3k branch if there are no objections <wink>.  I don't think a new PEP is
>>>> in order, but an update to PEP 3147 might make sense.
>>>
>>>How will this interact with PEP 384 if that is implemented?
>> I'm trying to come up with something that will work immediately while PEP 384
>> is being adopted.
>
>But how will modules specify that they support multiple ABIs then?

I didn't understand, so asked Benjamin for clarification in IRC.

<gutworth> barry: if python 3.3 will only load x.3.3.so, but x.3.2.so supports
           the stable abi, will it load it?  [14:25]
<barry> gutworth: thanks, now i get it :)  [14:26]
<barry> gutworth: i think it should, but it wouldn't under my scheme.  let me
        think about it
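
To spell out the gap, here is a small sketch of the lookup order as described
in this thread (versioned name first, then the traditional name).  The
function name candidate_names is hypothetical, and the real lookup lives in C,
so this is only a model of it.

```python
def candidate_names(module, running_version):
    """Names tried, in order, under the scheme as described in this thread."""
    major, minor = running_version
    return ["%s.%d.%d.so" % (module, major, minor), module + ".so"]


# What a Python 3.3 interpreter would look for when importing "x".
tried_by_33 = candidate_names("x", (3, 3))
found_stable_build = "x.3.2.so" in tried_by_33
```

Here found_stable_build is False: x.3.2.so is never tried by 3.3, even if it
was built against a stable ABI, which is the case Benjamin is asking about.
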

-Barry

From brett at python.org  Thu Jun 24 20:47:14 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 24 Jun 2010 11:47:14 -0700
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTinIv28woxfcxRCi_tUNyn2VVuHwuzJFJ4OgKXFJ@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com> 
	<AANLkTinIv28woxfcxRCi_tUNyn2VVuHwuzJFJ4OgKXFJ@mail.gmail.com>
Message-ID: <AANLkTinhVddAY0Xkc51mtjeuvYoIOJRiPNDD_ANI6TWC@mail.gmail.com>

On Thu, Jun 24, 2010 at 11:27, Guido van Rossum <guido at python.org> wrote:
> On Thu, Jun 24, 2010 at 10:48 AM, Brett Cannon <brett at python.org> wrote:
>> On Thu, Jun 24, 2010 at 08:50, Barry Warsaw <barry at python.org> wrote:
>>> This is a follow up to PEP 3147.  That PEP, already implemented in Python 3.2,
>>> allows for Python source files from different Python versions to live together
>>> in the same directory.  It does this by putting a magic tag in the .pyc file
>>> name and placing the .pyc file in a __pycache__ directory.
>>>
>>> Distros such as Debian and Ubuntu will use this to greatly simplify
>>> deploying Python, and Python applications and libraries.  Debian and Ubuntu
>>> usually ship more than one version of Python, and currently have to play
>>> complex games with symlinks to make this work.  PEP 3147 will go a long way to
>>> eliminating the need for extra directories and symlinks.
>>>
>>> One more thing I've found we need though, is a way to handle shared libraries
>>> for extension modules.  Just as we can get name collisions on foo.pyc, we can
>>> get collisions on foo.so.  We obviously cannot install foo.so built for Python
>>> 3.2 and foo.so built for Python 3.3 in the same location.  So symlink
>>> nightmare's mini-me is back.
>>>
>>> I have a fairly simple fix for this.  I'd actually be surprised if this hasn't
>>> been discussed before, but teh Googles hasn't turned up anything.
>>>
>>> The idea is to put the Python version number in the shared library file name,
>>> and extend .so lookup to find these extended file names.  So for example, we'd
>>> see foo.3.2.so instead, and Python would know how to dynload both that and the
>>> traditional foo.so file too (for backward compatibility).
>>>
>>> (On file naming: the original patch used foo.so.3.2 and that works just as
>>> well, but I thought there might be tools that expect exactly a '.so' suffix,
>>> so I changed it to put the Major.Minor version number to the left of the
>>> extension.  The exact naming scheme is of course open to debate.)
>>>
>>>
>>
>> While the idea is fine with me since I won't have any of my
>> directories cluttered with multiple .so files, I would still want to
>> add some moniker showing that the version number represents the
>> interpreter and not the .so file. If I read "foo.3.2.so", that naively
>> seems to mean the foo module's 3.2 release is what is installed, not
>> that it's built for CPython 3.2. So even though it
>> might be redundant, I would still want the VM name added.
>
> Well, for versions of the .so itself, traditionally version numbers
> are appended *after* the .so suffix (check your /lib directory :-).
>

Second thing you taught me today (first was the x[:0] trick)!

I've also been on OS X too long; /usr/lib there is just .dylib files, and that
format puts the version number before the extension.

>> Adding the VM name also means extension modules are not the exclusive
>> domain of CPython. If some other VM decides to make its own .so files
>> that are not binary compatible, then we should not preclude that, since
>> this solution costs nothing more than making a string comparison look
>> at 7 more characters.
>>
>> -Brett
>>
>> P.S.: I wish we could drop use of the 'module.so' variant at the same
>> time, for consistency sake and to cut out a stat call, but I know that
>> is asking too much.
>
> I wish so too. IIRC there used to be some modules that on Windows were
> wrappers around 3rd party DLLs and you can't have foo.dll as the
> module wrapping foo.dll the 3rd party DLL. (On Unix this problem
> doesn't exist because the 3rd party .so would be named libfoo.so, not
> foo.so.)

Wouldn't Barry's proposed solution actually fill this need since it
will give the file a custom Python suffix that more-or-less guarantees
no name clash with a third-party DLL?

From merwok at netwok.org  Thu Jun 24 20:50:41 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 24 Jun 2010 20:50:41 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624115048.4fd152e3@heresy>
References: <20100624115048.4fd152e3@heresy>
Message-ID: <4C23A901.7060100@netwok.org>

On 24/06/2010 17:50, Barry Warsaw (FLUFL) wrote:
> Other possible approaches:
>  * Extend the distutils API so that the .so file extension can be passed in,
>    instead of being essentially hardcoded to what Python's Makefile contains.

Third-party code relies on Distutils internal quirks, so it's frozen. Feel
free to open a bug against Distutils2 on the Python tracker if that
would be generally useful.

Regards


From merwok at netwok.org  Thu Jun 24 20:53:02 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 24 Jun 2010 20:53:02 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
Message-ID: <4C23A98E.4080303@netwok.org>

On 24/06/2010 19:48, Brett Cannon wrote:
> P.S.: I wish we could drop use of the 'module.so' variant at the same
> time, for consistency sake and to cut out a stat call, but I know that
> is asking too much.

At least, looking for spam/__init__module.so could be avoided. It seems
to me that the package definition does not allow that. The tradeoff
would be code complication for one less stat call. Worth a bug report?

Regards


From fuzzyman at voidspace.org.uk  Thu Jun 24 21:07:41 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Thu, 24 Jun 2010 20:07:41 +0100
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTil09NfdbqdhP1c1vJ08HNpbLqQIgs9iyoIxQsKP@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTil09NfdbqdhP1c1vJ08HNpbLqQIgs9iyoIxQsKP@mail.gmail.com>
Message-ID: <4C23ACFD.6040506@voidspace.org.uk>

On 24/06/2010 19:11, Brett Cannon wrote:
> On Thu, Jun 24, 2010 at 10:38, Bill Janssen <janssen at parc.com> wrote:
> [SNIP]
>
>> The language moratorium kind of makes this all theoretical, but building
>> a String ABC still would be a good start, and presumably isn't forbidden
>> by the moratorium.
>>
> Because a new ABC would go into the stdlib (I assume in collections or
> string) the moratorium does not apply.
>

Although it would require changes for builtin types like file to work 
with a new string ABC, right?

Michael



-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From brett at python.org  Thu Jun 24 21:10:38 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 24 Jun 2010 12:10:38 -0700
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <4C23ACFD.6040506@voidspace.org.uk>
References: <11597.1277401099@parc.com>
	<AANLkTil09NfdbqdhP1c1vJ08HNpbLqQIgs9iyoIxQsKP@mail.gmail.com> 
	<4C23ACFD.6040506@voidspace.org.uk>
Message-ID: <AANLkTik4y2P-eB2t10Z6p5rW8ZXNo2y5ggnSqdWOF6Zk@mail.gmail.com>

On Thu, Jun 24, 2010 at 12:07, Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> On 24/06/2010 19:11, Brett Cannon wrote:
>>
>> On Thu, Jun 24, 2010 at 10:38, Bill Janssen <janssen at parc.com> wrote:
>> [SNIP]
>>
>>>
>>> The language moratorium kind of makes this all theoretical, but building
>>> a String ABC still would be a good start, and presumably isn't forbidden
>>> by the moratorium.
>>>
>>
>> Because a new ABC would go into the stdlib (I assume in collections or
>> string) the moratorium does not apply.
>>
>
> Although it would require changes for builtin types like file to work with a
> new string ABC, right?

Only if they wanted to rely on some concrete implementation of a
method contained within the ABC. Otherwise that's what abc.register
exists for.
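
For illustration, registration is done through the register() method that
every ABC gets from abc.ABCMeta.  The String class below and its choice of
abstract methods are hypothetical; no such ABC exists in the stdlib.

```python
from abc import ABCMeta, abstractmethod


class String(metaclass=ABCMeta):
    """Hypothetical String ABC; the name and method set are assumptions."""

    @abstractmethod
    def __len__(self):
        ...

    @abstractmethod
    def __getitem__(self, index):
        ...


# Existing types become virtual subclasses without being modified.
String.register(str)
String.register(bytes)
```

After this, isinstance("x", String) and isinstance(b"x", String) are both
true, while neither str nor bytes had to change.
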

From ianb at colorstudy.com  Thu Jun 24 21:49:33 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 24 Jun 2010 14:49:33 -0500
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <11597.1277401099@parc.com>
References: <11597.1277401099@parc.com>
Message-ID: <AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>

On Thu, Jun 24, 2010 at 12:38 PM, Bill Janssen <janssen at parc.com> wrote:

> Here are a couple of ideas I'm taking away from the bytes/string
> discussion.
>
> First, it would probably be a good idea to have a String ABC.
>
> Secondly, maybe the string situation in 2.x wasn't as broken as we
> thought it was.  In particular, those who deal with lots of encoded
> strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
> are more like numbers than we think.  We have separate types for int,
> float, Decimal, etc.  But they're all numbers, and they all
> cross-operate.  In 2.x, it seems there were two missing features: no
> encoding attribute on str, which should have been there and should have
> been required, and the default encoding being "ASCII" (I can't tell you
> how many times I've had to fix that issue when a non-ASCII encoded str
> was passed to some output function).
>

I've started to form a conceptual notion that I think fits these cases.

We've set up a system where we think of text as natively unicode, with
encodings to put that unicode into a byte form.  This is certainly
appropriate in a lot of cases.  But there's a significant class of problems
where bytes are the native structure.  Network protocols are what we've been
discussing, and are a notable case of that.  That is, b'/' is the most
native sense of a path separator in a URL, or b':' is the most native sense
of what separates a header name from a header value in HTTP.  To disallow
unicode URLs or unicode HTTP headers would be rather anti-social, especially
because unicode is now the "native" string type in Python 3 (as an aside for
the WSGI spec we've been talking about using "native" strings in some
positions like dictionary keys, meaning Python 2 str and Python 3 str, while
being more exacting in other areas such as a response body which would
always be bytes).

The HTTP spec and other network protocols seem a little fuzzy on this,
because they were written before unicode even existed, and even later activity
happened at a point when "unicode" and "text" weren't widely considered the
same thing like they are now.  But I think the original intention is
revealed in a more modern specification like WebSockets, where they are very
explicit that ':' is just shorthand for a particular byte, it is not "text"
in our new modern notion of the term.

So with this idea in mind it makes more sense to me that *specific pieces of
text* can be reasonably treated as both bytes and text.  All the string
literals in urllib.parse.urlunsplit() for example.

The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it does not
become special('/x')) and special('/')+'x'=='/x' (again it becomes str).  This
avoids some of the cases of unicode or str infecting a system as they did in
Python 2 (where you might pass in unicode and everything works fine until
some non-ASCII is introduced).
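
A minimal sketch of those semantics as an ordinary class.  Only "+" is
handled, and the implementation details are my assumptions, not an agreed
design.

```python
class special:
    """Hypothetical ASCII-only literal that adapts to its neighbour's type."""

    def __init__(self, text):
        self._text = text
        self._bytes = text.encode("ascii")  # fails early for non-ASCII input

    def __add__(self, other):
        if isinstance(other, bytes):
            return self._bytes + other   # result is plain bytes
        if isinstance(other, str):
            return self._text + other    # result is plain str
        return NotImplemented

    def __radd__(self, other):
        if isinstance(other, bytes):
            return other + self._bytes
        if isinstance(other, str):
            return other + self._text
        return NotImplemented
```

The result always takes the type of the other operand, so the special value
never spreads through the rest of the program.
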

The one place where this might be tricky is if you have an encoding that is
not ASCII compatible.  But we can't guard against every possibility.  So it
would be entirely wrong to take a string encoded with UTF-16 and start to
use b'/' with it.  But there are other nonsensical combinations already
possible, especially with polymorphic functions; we can't guard against all
of them.  Also I'm unsure if something like UTF-16 is in any way compatible
with the kind of legacy systems that use bytes.  Can you encode your
filesystem with UTF-16?  I don't think you could encode a cookie with it.

> So maybe having a second string type in 3.x that consists of an encoded
> sequence of bytes plus the encoding, call it "estr", wouldn't have been
> a bad idea.  It would probably have made sense to have estr cooperate
> with the str type, in the same way that two different kinds of numbers
> cooperate, "promoting" the result of an operation only when necessary.
> This would automatically achieve the kind of polymorphic functionality
> that Guido is suggesting, but without losing the ability to do
>
>  x = e(ASCII)"bar"
>  a = ''.join(["foo", x])
>
> (or whatever the syntax for such an encoded string literal would be --
> I'm not claiming this is a good one) which I presume would bind "a" to a
> Unicode string "foobar" -- have to work out what gets promoted to what.
>

I would be entirely happy without a literal syntax.  But as Phillip has
noted, this can't be implemented *entirely* in a library as there are some
constraints with the current str/bytes implementations.  Reading PEP 3003
I'm not clear if such changes are part of the moratorium?  They seem like
they would be (sadly), but it doesn't seem clearly noted.

I think there's a *different* use case for things like
bytes-in-a-utf8-encoding (e.g., to allow XML data to be decoded lazily), but
that could be yet another class, and maybe shouldn't be polymorphically usable
as bytes (i.e., treat it as an optimized str representation that is
otherwise semantically equivalent).  A String ABC would formalize these
things.

-- 
Ian Bicking  |  http://blog.ianbicking.org

From barry at python.org  Thu Jun 24 22:46:37 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 16:46:37 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624142830.4c859faf@limelight.wooz.org>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>
	<20100624135119.00b9ac5c@heresy>
	<AANLkTindH5uADbSwan-xWV08YcDaEKI3CleaFjhdmHvX@mail.gmail.com>
	<20100624142830.4c859faf@limelight.wooz.org>
Message-ID: <20100624164637.22fd9160@heresy>

On Jun 24, 2010, at 02:28 PM, Barry Warsaw wrote:

>On Jun 24, 2010, at 01:00 PM, Benjamin Peterson wrote:
>
>>2010/6/24 Barry Warsaw <barry at python.org>:
>>> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>>>
>>>>2010/6/24 Barry Warsaw <barry at python.org>:
>>>>> Please let me know what you think.  I'm happy to just commit this to the
>>>>> py3k branch if there are no objections <wink>.  I don't think a new PEP is
>>>>> in order, but an update to PEP 3147 might make sense.
>>>>
>>>>How will this interact with PEP 384 if that is implemented?
>>> I'm trying to come up with something that will work immediately while PEP 384
>>> is being adopted.
>>
>>But how will modules specify that they support multiple ABIs then?
>
>I didn't understand, so asked Benjamin for clarification in IRC.
>
><gutworth> barry: if python 3.3 will only load x.3.3.so, but x.3.2.so supports
>           the stable abi, will it load it?  [14:25]
><barry> gutworth: thanks, now i get it :)  [14:26]
><barry> gutworth: i think it should, but it wouldn't under my scheme.  let me
>        think about it

So, we could say that PEP 384 compliant extension modules would get written
without a version specifier.  IOW, we'd treat foo.so as using the stable ABI.  It
would then be up to the Python runtime to throw ImportErrors if in fact we
were loading a legacy, non-PEP 384 compliant extension.
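
A minimal sketch of the lookup this implies, assuming the loader tries
the version-tagged name first and falls back to the bare name.  The
helper name candidate_filenames and the exact ordering are assumptions
for illustration, not what the patch does:

```python
def candidate_filenames(module_name, version_tag):
    """Return extension file names to try, most specific first.

    Hypothetical helper: the tagged name is for a build tied to one
    interpreter version; the bare name is assumed to mean "uses the
    PEP 384 stable ABI".
    """
    names = []
    if version_tag:
        names.append("%s.%s.so" % (module_name, version_tag))
    names.append("%s.so" % module_name)
    return names


# Python 3.3 looking for module "x": x.3.2.so is never considered,
# which is the gap Benjamin pointed out.
print(candidate_filenames("x", "3.3"))
```

Under this scheme a stable-ABI module built for 3.2 would be found by
3.3 only if it was installed as x.so.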

-Barry

From barry at python.org  Thu Jun 24 22:53:36 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 16:53:36 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
Message-ID: <20100624165336.27fc7cc9@heresy>

On Jun 24, 2010, at 10:48 AM, Brett Cannon wrote:

>While the idea is fine with me since I won't have any of my
>directories cluttered with multiple .so files, I would still want to
>add some moniker showing that the version number represents the
>interpreter and not the .so file. If I read "foo.3.2.so", that naively
>seems to mean that the foo module's 3.2 release is what is
>installed, not that it's built for CPython 3.2. So even though it
>might be redundant, I would still want the VM name added.

I have a new version of my patch that steals the "magic tag" idea from PEP
3147.  Note that it does not use the *actual* same piece of information to
compose the file name, but for now it does match the pyc tag string.

E.g.

% find . -name \*.so
./build/lib.linux-x86_64-3.2/math.cpython-32.so
./build/lib.linux-x86_64-3.2/select.cpython-32.so
./build/lib.linux-x86_64-3.2/_struct.cpython-32.so
...

Further, by default, ./configure doesn't add this tag so that you would have
to build Python with:

% SOABI=cpython-32 ./configure

to get anything between the module name and the extension.  I could of course
make this a configure switch instead, and could default it to some other magic
string instead of the empty string.

>Adding the VM name also doesn't make extension modules the exclusive
>domain of CPython either. If some other VM decides to make their own
>.so files that are not binary compatible then we should not preclude
>that, as this solution costs nothing more than making a string
>comparison look at 7 more characters.
>
>-Brett
>
>P.S.: I wish we could drop use of the 'module.so' variant at the same
>time, for consistency's sake and to cut out a stat call, but I know that
>is asking too much.

I think you're right that with the $SOABI trick above, you wouldn't get the
name collisions Guido recalls, and you could get rid of module.so.  OTOH, as I
am currently only targeting Linux, it seems like the module.so stat is wasted
anyway on that platform.

-Barry

From barry at python.org  Thu Jun 24 22:55:33 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 16:55:33 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTinIv28woxfcxRCi_tUNyn2VVuHwuzJFJ4OgKXFJ@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com>
	<AANLkTinIv28woxfcxRCi_tUNyn2VVuHwuzJFJ4OgKXFJ@mail.gmail.com>
Message-ID: <20100624165533.46a5fb8e@heresy>

On Jun 24, 2010, at 11:27 AM, Guido van Rossum wrote:

>On Thu, Jun 24, 2010 at 10:48 AM, Brett Cannon <brett at python.org> wrote:
>> While the idea is fine with me since I won't have any of my
>> directories cluttered with multiple .so files, I would still want to
>> add some moniker showing that the version number represents the
>> interpreter and not the .so file. If I read "foo.3.2.so", that naively
>> seems to mean that the foo module's 3.2 release is what is
>> installed, not that it's built for CPython 3.2. So even though it
>> might be redundant, I would still want the VM name added.
>
>Well, for versions of the .so itself, traditionally version numbers
>are appended *after* the .so suffix (check your /lib directory :-).

Which is probably another reason not to use foo.so.X.Y for Python extension
modules.  I think it would be confusing, and foo.<tag>.so looks nice and is
consistent with foo.<tag>.pyc.  (Ref to updated patch coming...)

-Barry

From guido at python.org  Thu Jun 24 22:59:09 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 Jun 2010 13:59:09 -0700
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
Message-ID: <AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>

I see it a little differently (though there is probably a common
concept lurking in here).

The protocols you mention are intentionally designed to be
encoding-neutral as long as the encoding is an ASCII superset. This
covers ASCII itself, Latin-1, Latin-N for other values of N, MacRoman,
Microsoft's code pages (most of them anyways), UTF-8, presumably at
least some of the Japanese encodings, and probably a host of others.
But it does not cover UTF-16, EBCDIC, and others. (Encodings that have
"shift bytes" that change the meaning of some or all ordinary ASCII
characters also aren't covered, unless such an encoding happens to
exclude the special characters that the protocol spec cares about).

The protocol specs typically go out of their way to specify what byte
values they use for syntactically significant positions (e.g. ':' in
headers, or '/' in URLs), while hand-waving about the meaning of "what
goes in between" since it is all typically treated as "not of
syntactic significance". So you can write a parser that looks at bytes
exclusively, and looks for a bunch of ASCII punctuation characters
(e.g. '<', '>', '/', '&'), and doesn't know or care whether the stuff
in between is encoded in Latin-15, MacRoman or UTF-8 -- it never looks
"inside" stretches of characters between the special characters and
just copies them. (Sometimes there may be *some* sections that are
required to be ASCII and there equivalence of a-z and A-Z is well
defined.)
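
As an illustration only, here is a bytes-only header splitter of the
kind described above.  The function name split_header and the sample
header are made up for the example:

```python
def split_header(line):
    """Split a raw header line on the first b':' without decoding it."""
    name, sep, value = line.partition(b":")
    if not sep:
        raise ValueError("no ':' separator in header line")
    # bytes.strip() only removes ASCII whitespace, so the payload
    # bytes are copied through untouched.
    return name.strip(), value.strip()


# The value happens to be Latin-1 here, but the parser never needs
# to know that; it only looks for the b':' byte.
raw = "X-Name: caf\u00e9".encode("latin-1")
name, value = split_header(raw)
print(name, value)
```

The same function works unchanged if the value is encoded as UTF-8 or
MacRoman, because none of those encodings reuse the ':' byte for
anything else.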

But I wouldn't go so far as to claim that interpreting the protocols
as text is wrong. After all we're talking exclusively about protocols
that are designed intentionally to be directly "human readable"
(albeit as a fall-back option) -- the only tool you need to debug the
traffic on the wire or socket is something that knows which subset of
ASCII is considered "printable" and which renders everything else
safely as a hex escape or even a special "unknown" character (like
Unicode's "?" inside a black diamond).

Depending on the requirements of a specific app (or framework) it may
be entirely reasonable to convert everything to Unicode and process
the resulting text; in other contexts it makes more sense to keep
everything as bytes. It also makes sense to have an interface library
to deal with a specific protocol that treats the protocol side as
bytes but interacts with the application using text, since that is
often how the application programmer wants to treat it anyway.

Of course, some protocols require the application programmer to be
aware of bytes as well in *some* cases -- examples are email and HTTP
which can be used to transfer text as well as binary data (e.g.
images). There is also the bootstrap problem where the wire data must
be partially parsed in order to find out the encoding to be used to
convert it to text. But that doesn't mean it's invalid to think about
it as text in many application contexts.

Regarding the proposal of a String ABC, I hope this isn't going to
become a backdoor to reintroduce the Python 2 madness of allowing
equivalency between text and bytes for *some* strings of bytes and not
others.

Finally, I do think that we should not introduce changes to the
fundamental behavior of text and bytes while the moratorium is in
place. Changes to specific stdlib APIs are fine however.

--Guido

On Thu, Jun 24, 2010 at 12:49 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> On Thu, Jun 24, 2010 at 12:38 PM, Bill Janssen <janssen at parc.com> wrote:
>>
>> Here are a couple of ideas I'm taking away from the bytes/string
>> discussion.
>>
>> First, it would probably be a good idea to have a String ABC.
>>
>> Secondly, maybe the string situation in 2.x wasn't as broken as we
>> thought it was.  In particular, those who deal with lots of encoded
>> strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
>> are more like numbers than we think.  We have separate types for int,
>> float, Decimal, etc.  But they're all numbers, and they all
>> cross-operate.  In 2.x, it seems there were two missing features: no
>> encoding attribute on str, which should have been there and should have
>> been required, and the default encoding being "ASCII" (I can't tell you
>> how many times I've had to fix that issue when a non-ASCII encoded str
>> was passed to some output function).
>
> I've started to form a conceptual notion that I think fits these cases.
>
> We've set up a system where we think of text as natively unicode, with
> encodings to put that unicode into a byte form.  This is certainly
> appropriate in a lot of cases.  But there's a significant class of problems
> where bytes are the native structure.  Network protocols are what we've been
> discussing, and are a notable case of that.  That is, b'/' is the most
> native sense of a path separator in a URL, or b':' is the most native sense
> of what separates a header name from a header value in HTTP.  To disallow
> unicode URLs or unicode HTTP headers would be rather anti-social, especially
> because unicode is now the "native" string type in Python 3 (as an aside for
> the WSGI spec we've been talking about using "native" strings in some
> positions like dictionary keys, meaning Python 2 str and Python 3 str, while
> being more exacting in other areas such as a response body which would
> always be bytes).
>
> The HTTP spec and other network protocols seem a little fuzzy on this,
> because they were written before unicode even existed, and even later activity
> happened at a point when "unicode" and "text" weren't widely considered the
> same thing like they are now.  But I think the original intention is
> revealed in a more modern specification like WebSockets, where they are very
> explicit that ':' is just shorthand for a particular byte, it is not "text"
> in our new modern notion of the term.
>
> So with this idea in mind it makes more sense to me that *specific pieces of
> text* can be reasonably treated as both bytes and text.  All the string
> literals in urllib.parse.urlunsplit() for example.
>
> The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it does not
> become special('/x')) and special('/')+'x'=='/x' (again it becomes str).  This
> avoids some of the cases of unicode or str infecting a system as they did in
> Python 2 (where you might pass in unicode and everything works fine until
> some non-ASCII is introduced).
>
> The one place where this might be tricky is if you have an encoding that is
> not ASCII compatible.  But we can't guard against every possibility.  So it
> would be entirely wrong to take a string encoded with UTF-16 and start to
> use b'/' with it.  But there are other nonsensical combinations already
> possible, especially with polymorphic functions, we can't guard against all
> of them.  Also I'm unsure if something like UTF-16 is in any way compatible
> with the kind of legacy systems that use bytes.  Can you encode your
> filesystem with UTF-16?  I don't think you could encode a cookie with it.
>
>> So maybe having a second string type in 3.x that consists of an encoded
>> sequence of bytes plus the encoding, call it "estr", wouldn't have been
>> a bad idea.  It would probably have made sense to have estr cooperate
>> with the str type, in the same way that two different kinds of numbers
>> cooperate, "promoting" the result of an operation only when necessary.
>> This would automatically achieve the kind of polymorphic functionality
>> that Guido is suggesting, but without losing the ability to do
>>
>>  x = e(ASCII)"bar"
>>  a = ''.join(["foo", x])
>>
>> (or whatever the syntax for such an encoded string literal would be --
>> I'm not claiming this is a good one) which I presume would bind "a" to a
>> Unicode string "foobar" -- have to work out what gets promoted to what.
>
> I would be entirely happy without a literal syntax.  But as Phillip has
> noted, this can't be implemented *entirely* in a library as there are some
> constraints with the current str/bytes implementations.  Reading PEP 3003
> I'm not clear if such changes are part of the moratorium?  They seem like
> they would be (sadly), but it doesn't seem clearly noted.
>
> I think there's a *different* use case for things like
> bytes-in-a-utf8-encoding (e.g., to allow XML data to be decoded lazily), but
> that could be yet another class, and maybe shouldn't be polymorphically usable
> as bytes (i.e., treat it as an optimized str representation that is
> otherwise semantically equivalent).  A String ABC would formalize these
> things.
>
> --
> Ian Bicking  |  http://blog.ianbicking.org


-- 
--Guido van Rossum (python.org/~guido)

From brett at python.org  Thu Jun 24 23:08:14 2010
From: brett at python.org (Brett Cannon)
Date: Thu, 24 Jun 2010 14:08:14 -0700
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23A98E.4080303@netwok.org>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com> 
	<4C23A98E.4080303@netwok.org>
Message-ID: <AANLkTimrysVnyvvUchmbETYNDVx3ELoQDgo04_Z3Cdtg@mail.gmail.com>

On Thu, Jun 24, 2010 at 11:53, Éric Araujo <merwok at netwok.org> wrote:
> On 24/06/2010 19:48, Brett Cannon wrote:
>> P.S.: I wish we could drop use of the 'module.so' variant at the same
>> time, for consistency's sake and to cut out a stat call, but I know that
>> is asking too much.
>
> At least, looking for spam/__init__module.so could be avoided. It seems
> to me that the package definition does not allow that.

I thought no one had bothered to change import.c to allow for
extension modules to act as a package's __init__?

As for not being allowed, I don't agree with that assessment. If you
treat a package's __init__ module as simply that, a module that would
be named __init__ when imported, then __init__module.c would be valid
(and that's what importlib does).

> The tradeoff
> would be code complication for one less stat call. Worth a bug report?

Nah.

From barry at python.org  Thu Jun 24 23:09:44 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 17:09:44 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
Message-ID: <20100624170944.7e68ad21@heresy>

On Jun 24, 2010, at 11:05 AM, Daniel Stutzbach wrote:

>On Thu, Jun 24, 2010 at 10:50 AM, Barry Warsaw <barry at python.org> wrote:
>
>> The idea is to put the Python version number in the shared library file
>> name,
>> and extend .so lookup to find these extended file names.  So for example,
>> we'd
>> see foo.3.2.so instead, and Python would know how to dynload both that and
>> the
>> traditional foo.so file too (for backward compatibility).
>>
>
>What use case does this address?

Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
we can do that without collisions on the pyc files, but would still have to
symlink for extension module .so files, because they are always named foo.so
and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
foo.so.

So using the same trick as in PEP 3147, if we can name Python 3.2's foo
extension differently than the incompatible Python 3.3's foo extension, we can
have them live in the same directory without symlink tricks.

>PEP 3147 addresses the fact that the user may have different versions of
>Python installed and each wants to write a .pyc file when loading a module.
> .so files are not generated simply by running the Python interpreter, ergo
>.so files are not an issue for that use case.

See above.  It doesn't matter whether the pyc or so is created at run time by
the user or by the distro build system.  If the files for different Python
versions end up in the same directory, they must be named differently too.

>If you want to make it so a system can install a package in just one
>location to be used by multiple Python installations, then the version
>number isn't enough.  You also need to distinguish debug builds, profiling
>builds, Unicode width (see issue8654), and probably several other
>./configure options.

This is a good point, but more easily addressed.  Let's say a distro makes
four Python 3.2 variants available: a "normal" build, a debug build, and
UCS2 and UCS4 versions of the above.  All we need to do is choose a different
.so ABI tag (see previous followup) for each of those builds.  My updated patch
(coming soon) allows you to define that tag to configure.  So e.g.

Normal build UCSX: SOABI=cpython-32 ./configure
Debug build UCSX:  SOABI=cpython-32-d ./configure
Normal build UCSY: SOABI=cpython-32-w ./configure
Debug build UCSY:  SOABI=cpython-32-dw ./configure

Mix and match for any other build options you care about.  Because the distro
controls how Python is configured, this should be fairly easy to achieve.
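
A rough sketch of how a distro build script might derive those tags.
The function name soabi_tag is hypothetical, and mapping the "w"
suffix to the wide-unicode (UCSY) build is an assumption read off the
table above:

```python
def soabi_tag(version="32", debug=False, wide_unicode=False):
    """Compose a tag such as 'cpython-32-dw' from build options.

    The flag letters follow the table in the message: 'd' for a debug
    build, 'w' for the second unicode width.
    """
    flags = ""
    if debug:
        flags += "d"
    if wide_unicode:
        flags += "w"
    tag = "cpython-" + version
    return tag + "-" + flags if flags else tag


for debug in (False, True):
    for wide in (False, True):
        print("SOABI=%s ./configure" % soabi_tag("32", debug, wide))
```

Any further build option would just add another flag letter, as long
as every build that is not binary compatible ends up with its own tag.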

-Barry

From fdrake at acm.org  Thu Jun 24 23:12:21 2010
From: fdrake at acm.org (Fred Drake)
Date: Thu, 24 Jun 2010 17:12:21 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624165533.46a5fb8e@heresy>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikPDJ51F9KbGMtrR_58wBywHWVhe80GMATWwSXs@mail.gmail.com> 
	<AANLkTinIv28woxfcxRCi_tUNyn2VVuHwuzJFJ4OgKXFJ@mail.gmail.com> 
	<20100624165533.46a5fb8e@heresy>
Message-ID: <AANLkTin3GOgRLXujPgBQ9olby-t9FsJslMieDfCAojmh@mail.gmail.com>

On Thu, Jun 24, 2010 at 4:55 PM, Barry Warsaw <barry at python.org> wrote:
> Which is probably another reason not to use foo.so.X.Y for Python extension
> modules.

Clearly, foo.so.3.2 is the man page for the foo.so.3 system call.

The ABI ident definitely has to be elsewhere.


  -Fred

-- 
Fred L. Drake, Jr.    <fdrake at gmail.com>
"A storm broke loose in my mind."  --Albert Einstein

From barry at python.org  Thu Jun 24 23:23:02 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 17:23:02 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23A901.7060100@netwok.org>
References: <20100624115048.4fd152e3@heresy>
	<4C23A901.7060100@netwok.org>
Message-ID: <20100624172302.024687ef@heresy>

On Jun 24, 2010, at 08:50 PM, Éric Araujo wrote:

>On 24/06/2010 17:50, Barry Warsaw (FLUFL) wrote:
>> Other possible approaches:
>>  * Extend the distutils API so that the .so file extension can be passed in,
>>    instead of being essentially hardcoded to what Python's Makefile contains.
>
>Third-party code relies on Distutils internal quirks, so it's frozen. Feel
>free to open a bug against Distutils2 on the Python tracker if that
>would be generally useful.

Depending on how strict this constraint is, it could make things more
difficult.  I can control what shared library file names Python will load
statically, but in order to support PEP 384 I think I need to be able to
control what file extensions build_ext writes.

My updated patch does this in a backward compatible way.  Of course, distutils
hacks have their tentacles all up in the distutils internals, so maybe my
patch will break something after all.  I can think of a few even hackier ways
to work around that if necessary.

My updated patch:
 * Adds an optional argument to build_ext.get_ext_fullpath() and
   build_ext.get_ext_filename().  This extra argument is the Extension
   instance being built.  (Boy, just in case anyone's already playing with the
   time machine, it sure would have been nice if these methods had originally
   just taken the Extension instance and dug out ext.name instead of passing
   the string in.)
 * Adds an optional new keyword argument to the Extension class, called
   so_abi_tag.  If given, this overrides the Makefile $SO variable extension.

What this means is that with no changes, a non-PEP 384 compliant extension
module wouldn't have to change anything:

setup(
    name='stupid',
    version='0.0',
    packages=['stupid', 'stupid.tests'],
    ext_modules=[Extension('_stupid',
                           ['src/stupid.c'],
                           )],
    test_suite='stupid.tests',
    )

With a Python built like so:

    % SOABI=cpython-32 ./configure

you'd end up with a _stupid.cpython-32.so module.

However, if you knew your extension module was PEP 384 compliant, and could be
shared on >=Python 3.2, you would do:

setup(
    name='stupid',
    version='0.0',
    packages=['stupid', 'stupid.tests'],
    ext_modules=[Extension('_stupid',
                           ['src/stupid.c'],
                           so_abi_tag='',
                           )],
    test_suite='stupid.tests',
    )

and now you'd end up with _stupid.so, which I propose to mean it's PEP 384 ABI
compliant.  (There may not be any other use case than so_abi_tag='' or
so_abi_tag=None, in which case, the Extension keyword *might* be better off as
a boolean.)
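
To make the two cases concrete, here is a small sketch of the naming
rule described above.  ext_filename is a stand-in written for this
example, not the real build_ext.get_ext_filename(), and build_tag
stands for whatever $SOABI the interpreter was configured with:

```python
def ext_filename(ext_name, so_abi_tag=None, build_tag="cpython-32"):
    """Hypothetical: file name that would be written for an extension.

    so_abi_tag=None -> fall back to the interpreter's build tag
    so_abi_tag=''   -> no tag at all (proposed to mean PEP 384 compliant)
    """
    tag = build_tag if so_abi_tag is None else so_abi_tag
    return ext_name + ("." + tag if tag else "") + ".so"


print(ext_filename("_stupid"))                 # tagged for this build
print(ext_filename("_stupid", so_abi_tag=""))  # shared across versions
```

Note that a Python configured without SOABI (build_tag='') would
produce _stupid.so in both cases, so the bare name alone could not
distinguish a legacy extension from a PEP 384 compliant one.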

Now of course PEP 384 isn't implemented, so it's a bit of a moot point.  But
if some form of versioned .so file naming is accepted for Python 3.2, I'll
update PEP 384 with possible solutions.

-Barry

From barry at python.org  Thu Jun 24 23:27:00 2010
From: barry at python.org (Barry Warsaw)
Date: Thu, 24 Jun 2010 17:27:00 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624115048.4fd152e3@heresy>
References: <20100624115048.4fd152e3@heresy>
Message-ID: <20100624172700.0b837222@heresy>

On Jun 24, 2010, at 11:50 AM, Barry Warsaw wrote:

>Please let me know what you think.  I'm happy to just commit this to the py3k
>branch if there are no objections <wink>.  I don't think a new PEP is in
>order, but an update to PEP 3147 might make sense.

Thanks for all the quick feedback.  I've made some changes based on the
comments so far.  The bzr branch is updated, and a new patch is available
here:

http://pastebin.ubuntu.com/454688/

If reception continues to be mildly approving, I'll open an issue on
bugs.python.org and attach the patch to that.

-Barry

From merwok at netwok.org  Thu Jun 24 23:37:10 2010
From: merwok at netwok.org (=?UTF-8?B?w4lyaWMgQXJhdWpv?=)
Date: Thu, 24 Jun 2010 23:37:10 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624172302.024687ef@heresy>
References: <20100624115048.4fd152e3@heresy>	<4C23A901.7060100@netwok.org>
	<20100624172302.024687ef@heresy>
Message-ID: <4C23D006.6080800@netwok.org>

Your plan seems good. Adding keyword arguments should not create
compatibility issues, and I suspect the impact on the code of build_ext
may be actually quite small. I'll try to review your patch even though I
don't know C or compiler oddities, but Tarek will have the best insight
and the final word.

In case the time machine's not available, your suggestion about getting
the filename from the Extension instance instead of passing in a string
can most certainly land in distutils2.

Regards


From ianb at colorstudy.com  Thu Jun 24 23:44:12 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Thu, 24 Jun 2010 16:44:12 -0500
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com> 
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
Message-ID: <AANLkTikI09E7-5_9ZOfhRr4i2UIVJ3IpHK7HqMiaqmV4@mail.gmail.com>

On Thu, Jun 24, 2010 at 3:59 PM, Guido van Rossum <guido at python.org> wrote:

> The protocol specs typically go out of their way to specify what byte
> values they use for syntactically significant positions (e.g. ':' in
> headers, or '/' in URLs), while hand-waving about the meaning of "what
> goes in between" since it is all typically treated as "not of
> syntactic significance". So you can write a parser that looks at bytes
> exclusively, and looks for a bunch of ASCII punctuation characters
> (e.g. '<', '>', '/', '&'), and doesn't know or care whether the stuff
> in between is encoded in Latin-15, MacRoman or UTF-8 -- it never looks
> "inside" stretches of characters between the special characters and
> just copies them. (Sometimes there may be *some* sections that are
> required to be ASCII and there equivalence of a-z and A-Z is well
> defined.)
>

Yes, these are the specific characters that I think we can handle
specially.  For instance, the list of all string literals used by urlsplit
and urlunsplit:
'//'
'/'
':'
'?'
'#'
''
'http'
A list of all valid scheme characters (a-z etc)
Some lists for scheme-specific parsing (which all contain valid scheme
characters)

All of these are constrained to ASCII, and must be constrained to ASCII, and
everything else in a URL is treated as basically opaque.

So if we turned these characters into byte-or-str objects I think we'd
basically be true to the intent of the specs, and in a practical sense we'd
be able to make these functions polymorphic.  I suspect this same pattern
will be present most places where people want polymorphic behavior.

For now we could do something incomplete and just avoid using operators we
can't overload (is it possible to at least make them produce a readable
exception?)

I think we'll avoid a lot of the confusion that was present with Python 2 by
not making the coercions transitive.  For instance, here's something that
would work in Python 2:

  urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))

And you'd get out a unicode string, except that would break the first time
that query string (u'bar=baz') was not ASCII (but not until then!)

Here's the urlunsplit code:

def urlunsplit(components):
    scheme, netloc, url, query, fragment = components
    if netloc or (scheme and scheme in uses_netloc and url[:2] != '//'):
        if url and url[:1] != '/': url = '/' + url
        url = '//' + (netloc or '') + url
    if scheme:
        url = scheme + ':' + url
    if query:
        url = url + '?' + query
    if fragment:
        url = url + '#' + fragment
    return url

If all those literals were this new special kind of string, if you call:

  urlunsplit((b'http', b'example.com', b'/foo', 'bar=baz', b''))

You'd end up constructing the URL b'http://example.com/foo' and then
running:

    url = url + special('?') + query

And that would fail because b'http://example.com/foo' + special('?') would
be b'http://example.com/foo?' and you cannot add that to the str 'bar=baz'.
So we'd be avoiding the Python 2 craziness.
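
A minimal Python 3 sketch of the idea above. The class name Special and the function urlunsplit_sketch are hypothetical, not an existing stdlib API, and the netloc handling is simplified (no uses_netloc check) so that only "+" needs to be overloaded:

```python
class Special:
    """Hypothetical ASCII-only literal that takes on the type of its neighbour."""

    def __init__(self, text):
        text.encode('ascii')  # reject non-ASCII markers up front
        self._text = text

    def _like(self, other):
        # Return this literal as bytes or str, matching the other operand.
        if isinstance(other, bytes):
            return self._text.encode('ascii')
        if isinstance(other, str):
            return self._text
        return None

    def __add__(self, other):
        mine = self._like(other)
        return NotImplemented if mine is None else mine + other

    def __radd__(self, other):
        mine = self._like(other)
        return NotImplemented if mine is None else other + mine


def urlunsplit_sketch(components):
    # Simplified variant of urlunsplit: every literal is a Special.
    scheme, netloc, url, query, fragment = components
    if netloc:
        url = Special('//') + netloc + url
    if scheme:
        url = scheme + Special(':') + url
    if query:
        url = url + Special('?') + query
    if fragment:
        url = url + Special('#') + fragment
    return url


all_bytes = urlunsplit_sketch((b'http', b'example.com', b'/foo', b'bar=baz', b''))
all_text = urlunsplit_sketch(('http', 'example.com', '/foo', 'bar=baz', ''))

mixed_error = None
try:
    urlunsplit_sketch((b'http', b'example.com', b'/foo', 'bar=baz', b''))
except TypeError as exc:
    mixed_error = exc  # bytes + str is still refused
```

The result of Special + bytes is plain bytes, not another Special, which is what keeps the coercion from being transitive.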

-- 
Ian Bicking  |  http://blog.ianbicking.org

From solipsis at pitrou.net  Thu Jun 24 23:50:56 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Thu, 24 Jun 2010 23:50:56 +0200
Subject: [Python-Dev] thoughts on the bytes/string discussion
References: <11597.1277401099@parc.com>
	<AANLkTil09NfdbqdhP1c1vJ08HNpbLqQIgs9iyoIxQsKP@mail.gmail.com>
	<4C23ACFD.6040506@voidspace.org.uk>
Message-ID: <20100624235056.5a9930e6@pitrou.net>

On Thu, 24 Jun 2010 20:07:41 +0100
Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> 
> Although it would require changes for builtin types like file to work 
> with a new string ABC, right?

There is no builtin file type in 3.x.
Besides, it is not an ABC-level problem; the IO layer is written in C
(although there's still the Python implementation to play with), which
would mandate an abstract C API to access unicode-like objects
(just as there's already the buffer API to access bytes-like
objects).
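
For reference, the existing buffer API is visible from Python through memoryview. This is only a sketch of the bytes-like side (the helper name byte_length is made up for illustration); there is no equivalent protocol for str, which is the gap described above:

```python
import array


def byte_length(obj):
    """Size in bytes of any object that exports the buffer API."""
    with memoryview(obj) as view:
        return view.nbytes


from_bytes = byte_length(b'abc')
from_bytearray = byte_length(bytearray(b'abc'))
from_array = byte_length(array.array('B', [1, 2, 3]))

str_error = None
try:
    byte_length('abc')  # str does not export a buffer
except TypeError as exc:
    str_error = exc
```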

Regards

Antoine.


From scott+python-dev at scottdial.com  Thu Jun 24 23:53:06 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Thu, 24 Jun 2010 17:53:06 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624170944.7e68ad21@heresy>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy>
Message-ID: <4C23D3C2.1060500@scottdial.com>

On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>> What use case does this address?
> 
> Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
> all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
> we can do that without collisions on the pyc files, but would still have to
> symlink for extension module .so files, because they are always named foo.so
> and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
> foo.so.

If the package has .so files that aren't compatible with other versions
of Python, then what is the motivation for placing them in a shared
location (since they can't actually be shared)?

> So using the same trick as in PEP 3147, if we can name Python 3.2's foo
> extension differently than the incompatible Python 3.3's foo extension, we can
> have them live in the same directory without symlink tricks.

Why would a symlink trick even be necessary if there is a
version-unspecific directory and a version-specific directory on the
search path?

>> PEP 3147 addresses the fact that the user may have different versions of
>> Python installed and each wants to write a .pyc file when loading a module.
>> .so files are not generated simply by running the Python interpreter, ergo
>> .so files are not an issue for that use case.
> 
> See above.  It doesn't matter whether the pyc or so is created at run time by
> the user or by the distro build system.  If the files for different Python
> versions end up in the same directory, they must be named differently too.

But the only motivation for doing this with .pyc files is that the .py
files are able to be shared, since the .pyc is an on-demand-generated,
version-specific artifact (and not the source). The .so file is created
offline by another toolchain, is version-specific, and presumably you
are not suggesting that Python generate it on-demand.

> 
>> If you want to make it so a system can install a package in just one
>> location to be used by multiple Python installations, then the version
>> number isn't enough.  You also need to distinguish debug builds, profiling
>> builds, Unicode width (see issue8654), and probably several other
>> ./configure options.
> 
> This is a good point, but more easily addressed.  Let's say a distro makes
> three Python 3.2 variants available, one "normal" build, a debug build, and
> UCS2 and UCS4 versions of the above.  All we need to do is choose a different
> .so ABI tag (see previous follow) for each of those builds.  My updated patch
> (coming soon) allows you to define that tag to configure.  So e.g.

Why is this use case not already addressed by having independent
directories? And why is there an incentive to co-mingle these
version-punned files with version-agnostic ones?

> Mix and match for any other build options you care about.  Because the distro
> controls how Python is configured, this should be fairly easy to achieve.

For packages that have .so files, won't the distro already have to build
multiple copies of that package for all versions of Python? So, why can't
it place them in separate directories that are version-specific at that
time? This is not the same as placing .py files that are
version-agnostic into a version-agnostic location.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From tjreedy at udel.edu  Fri Jun 25 00:00:30 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 24 Jun 2010 18:00:30 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <11597.1277401099@parc.com>
References: <11597.1277401099@parc.com>
Message-ID: <i00khu$fvg$1@dough.gmane.org>

On 6/24/2010 1:38 PM, Bill Janssen wrote:
>
> Secondly, maybe the string situation in 2.x wasn't as broken as we
> thought it was.  In particular, those who deal with lots of encoded
> strings seemed to find it handy, and miss it in 3.x.  Perhaps strings
> are more like numbers than we think.  We have separate types for int,
> float, Decimal, etc.  But they're all numbers, and they all
> cross-operate.

No they do not. Decimal only mixes properly with ints, but not with 
anything else, sometimes in surprising and havoc-creating ways:
 >>> Decimal(0) == float(0)
False

I believe that and other comparisons may be fixed in 3.2, but I know 
there was lots of discussion of whether float + decimal should return a 
float or decimal, with good arguments both ways. To put it another way, 
there are potential problems with either choice. Automatic mixed-mode 
arithmetic is not always a slam-dunk, no-problem choice.
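
A short check of how this looks on a current Python 3, assuming 3.2 or later (where, as far as I know, the Decimal/float comparison was made exact while mixed arithmetic stayed an error):

```python
from decimal import Decimal

# Decimal mixes with int ...
int_mix = Decimal(1) + 2

# ... but arithmetic with float is refused rather than guessed at.
float_mix_error = None
try:
    Decimal(1) + 1.5
except TypeError as exc:
    float_mix_error = exc

# The comparison shown above compares by value on 3.2 and later.
equal_zero = Decimal(0) == float(0)
```

So the comparison was fixed, and the float + decimal question was answered by refusing to pick a result type.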

That aside, there are a couple of places where I think the comparison 
breaks down. If one adds a thousand ints and then a float, there is only 
the final number to convert. If one adds a thousand bytes and then a 
unicode, there is the concatenation of the thousand bytes to convert. 
Or should the result be the concatenation of a thousand unicode 
conversions? This brings up the distributivity (or not) of conversion 
over summation. In general, float(i) + float(j) == float(i+j), for i,j 
ints. I am not sure the same is true if i,j are bytes with some encoding 
and the conversion is unicode. Does it depend on the encoding?
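
A sketch that suggests the answer is yes, it depends on the encoding and on where the chunks are split. The helper names are made up for illustration:

```python
def decode_then_join(chunks, encoding):
    # Convert each piece, then concatenate the text.
    return ''.join(chunk.decode(encoding) for chunk in chunks)


def join_then_decode(chunks, encoding):
    # Concatenate the bytes, then convert once.
    return b''.join(chunks).decode(encoding)


# ASCII: the two orders agree.
ascii_chunks = [b'ab', b'cd']

# UTF-16 with a BOM per chunk: the second BOM survives as U+FEFF
# when the bytes are joined first.
utf16_chunks = ['ab'.encode('utf-16'), 'cd'.encode('utf-16')]

# UTF-8 split in the middle of a multi-byte character: the pieces
# cannot be decoded on their own.
encoded = '\u00e9'.encode('utf-8')
split_chunks = [encoded[:1], encoded[1:]]

split_error = None
try:
    decode_then_join(split_chunks, 'utf-8')
except UnicodeDecodeError as exc:
    split_error = exc
```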

-- 
Terry Jan Reedy


From ncoghlan at gmail.com  Fri Jun 25 00:01:38 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 Jun 2010 08:01:38 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100624170856.0853D3A4099@sparrow.telecommunity.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100624170856.0853D3A4099@sparrow.telecommunity.com>
Message-ID: <AANLkTilk9WEirzJAXdJ0LK-NfmpoBqvTjg3EuOqKGsp8@mail.gmail.com>

On Fri, Jun 25, 2010 at 3:07 AM, P.J. Eby <pje at telecommunity.com> wrote:
> (Btw, in some earlier emails, Stephen, you implied that this could be fixed
> with codecs -- but it can't, because the problem isn't with the bytes
> containing invalid Unicode, it's with the Unicode containing invalid bytes
> -- i.e., characters that can't be encoded to the ultimate codec target.)

That's what the surrogateescape error handler is for though - it will
happily accept mojibake on input (mapping invalid bytes to lone
surrogate code points, U+DC80 to U+DCFF), and happily generate
mojibake on output (recreating the invalid bytes from those
surrogates) as well.
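
A minimal round-trip sketch. Per PEP 383, each undecodable byte is mapped to a lone surrogate in the range U+DC80 to U+DCFF and mapped back on encoding; the sample bytes here are just an illustration:

```python
raw = b'caf\xe9'  # Latin-1 bytes, not valid UTF-8

# Decoding keeps the bad byte as the lone surrogate U+DCE9.
text = raw.decode('utf-8', errors='surrogateescape')

# Encoding with the same handler restores the original bytes.
round_tripped = text.encode('utf-8', errors='surrogateescape')

# Without the handler the escaped character cannot be encoded.
strict_error = None
try:
    text.encode('utf-8')
except UnicodeEncodeError as exc:
    strict_error = exc
```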

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From guido at python.org  Fri Jun 25 00:01:46 2010
From: guido at python.org (Guido van Rossum)
Date: Thu, 24 Jun 2010 15:01:46 -0700
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTikI09E7-5_9ZOfhRr4i2UIVJ3IpHK7HqMiaqmV4@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com> 
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com> 
	<AANLkTikI09E7-5_9ZOfhRr4i2UIVJ3IpHK7HqMiaqmV4@mail.gmail.com>
Message-ID: <AANLkTikcfwUxQZ7zAFhkVeg5DaakJhDBbnfR3JbDsn83@mail.gmail.com>

On Thu, Jun 24, 2010 at 2:44 PM, Ian Bicking <ianb at colorstudy.com> wrote:
> I think we'll avoid a lot of the confusion that was present with Python 2 by
> not making the coercions transitive.  For instance, here's something that
> would work in Python 2:
>
>   urlunsplit(('http', 'example.com', '/foo', u'bar=baz', ''))
>
> And you'd get out a unicode string, except that would break the first time
> that query string (u'bar=baz') was not ASCII (but not until then!)

Actually, that wouldn't be a problem. The problem would be this:

   urlunsplit(('http', 'example.com', u'/foo', 'bar=baz', ''))

(I moved the "u" prefix from bar=baz to /foo.) And this would break
when instead of baz there was some non-ASCII UTF-8, e.g.


urlunsplit(('http', 'example.com', u'/foo', 'bar=\xe1\x88\xb4', ''))
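
The failure can be imitated in Python 3 by making the implicit step explicit. This is only a sketch of what Python 2 did when a unicode and a str met (the function name is made up): the byte string was decoded as ASCII before concatenation.

```python
def py2_style_concat(text, raw):
    """Mimic Python 2's unicode + str: decode the byte string as ASCII first."""
    return text + raw.decode('ascii')


# Pure-ASCII query bytes: the mix works and yields text.
ok = py2_style_concat('/foo?', b'bar=baz')

# Non-ASCII UTF-8 bytes: the implicit decode blows up.
failure = None
try:
    py2_style_concat('/foo?', b'bar=\xe1\x88\xb4')
except UnicodeDecodeError as exc:
    failure = exc
```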
-- 
--Guido van Rossum (python.org/~guido)

From ncoghlan at gmail.com  Fri Jun 25 00:15:02 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 Jun 2010 08:15:02 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTinR08fCedIzZfG4It4FrTHF15S_m_wY-j1i5NmG@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>
	<AANLkTimztrpXVK2rK9r8qhAGU8Epyh_M0sZlzx6Jb_St@mail.gmail.com>
	<AANLkTinR08fCedIzZfG4It4FrTHF15S_m_wY-j1i5NmG@mail.gmail.com>
Message-ID: <AANLkTin7xaXjdU92eihkEJquI4x0mW7q1Reqpn8WTgbp@mail.gmail.com>

On Fri, Jun 25, 2010 at 1:41 AM, Guido van Rossum <guido at python.org> wrote:
> I don't think we should abuse sum for this. A simple idiom to get the
> *empty* string of a particular type is x[:0] so you could write
> something like this to concatenate a list of strings or bytes:
> xs[:0].join(xs). Note that if xs is empty we wouldn't know what to do
> anyway so this should be disallowed.

That's a good trick, although there's a "[0]" missing from your join
example ("type(xs[0])()" is another way to spell the same idea, but
the subscripting version would likely be faster since it skips the
builtin lookup). Promoting that over explicit use of empty str and
bytes literals is probably step 1 in eliminating gratuitous breakage
of bytes/str polymorphism (this trick also has the benefit of working
with non-builtin character sequence types).
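
Spelled out with the missing "[0]", the idiom looks roughly like this (the function name concat is just for illustration):

```python
def concat(xs):
    """Join a non-empty list of str or bytes without naming either type."""
    empty = xs[0][:0]  # '' for str items, b'' for bytes items
    return empty.join(xs)


joined_text = concat(['spam', 'eggs'])
joined_bytes = concat([b'spam', b'eggs'])

empty_error = None
try:
    concat([])  # nothing to take the type from
except IndexError as exc:
    empty_error = exc
```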

Use of non-empty bytes/str literals is going to be harder to handle -
actually trying to apply a polymorphic philosophy to the Python 3 URL
parsing libraries may be a good way to learn more on that front.

Cheers,
Nick.

P.S. I'm off to Sydney for PyconAU this evening, so I'm not sure how
much time I'll get to follow python-dev until next week.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From tjreedy at udel.edu  Fri Jun 25 00:20:52 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 24 Jun 2010 18:20:52 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
Message-ID: <i00lo4$lir$1@dough.gmane.org>

On 6/24/2010 4:59 PM, Guido van Rossum wrote:

> But I wouldn't go so far as to claim that interpreting the protocols
> as text is wrong. After all we're talking exclusively about protocols
> that are designed intentionally to be directly "human readable"

I agree that the claim "':' is just a byte" is a bit shortsighted.

If the designers of the protocols had intended to use uninterpreted 
bytes as protocol markers, they could and I suspect would have used 
unused control codes, of which there are several. Then there would have 
been no need for escape mechanisms to put things like :<> into content text.

I am very sure that the reason for specifying *ascii* byte values was to 
be crystal clear as to what *character* was meant and to *exclude* use on 
the internet of the main incompatible competitor encoding -- IBM's 
EBCDIC -- which IBM used in all of *its* networks. Until the IBM PC came 
out in the early 1980s (and IBM originally saw that as a minor sideline 
and something of a toy), there was a battle over byte encodings between 
IBM and everyone else.
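
To illustrate why the byte value alone is not enough: a small sketch using cp037, one of the EBCDIC code pages that ships with Python's codecs (chosen here only as an example).

```python
marker = ':'

ascii_byte = marker.encode('ascii')
ebcdic_byte = marker.encode('cp037')  # same character, different byte

# Reading ASCII protocol bytes as EBCDIC gives different text.
misread = b'http:'.decode('cp037')
```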

-- 
Terry Jan Reedy


From mal at egenix.com  Fri Jun 25 00:35:05 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Fri, 25 Jun 2010 00:35:05 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23D3C2.1060500@scottdial.com>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>
Message-ID: <4C23DD99.9050604@egenix.com>

Scott Dial wrote:
> On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>>> What use case does this address?
>>
>>> If you want to make it so a system can install a package in just one
>>> location to be used by multiple Python installations, then the version
>>> number isn't enough.  You also need to distinguish debug builds, profiling
>>> builds, Unicode width (see issue8654), and probably several other
>>> ./configure options.
>>
>> This is a good point, but more easily addressed.  Let's say a distro makes
>> three Python 3.2 variants available, one "normal" build, a debug build, and
>> UCS2 and UCS4 versions of the above.  All we need to do is choose a different
>> .so ABI tag (see previous follow) for each of those builds.  My updated patch
>> (coming soon) allows you to define that tag to configure.  So e.g.
> 
> Why is this use case not already addressed by having independent
> directories? And why is there an incentive to co-mingle these
> version-punned files with version-agnostic ones?

I don't think this is a good idea. After a while your Python
lib directories would need some serious dusting off to make them
maintainable again.

Disk space is cheap so setting up dedicated directories for each
variant will result in a much easier to manage installation.

If you want a really clever setup, use hard links between those
directories (you can also use symlinks if you like).
Then a change in one Python file will automatically
propagate to all other variant dirs without any maintenance
effort. Together with PYTHONHOME this makes a really nice
virtualenv-like environment.
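
A rough sketch of the hard link idea, run in a temporary directory. The directory and file names are invented for the example, and it assumes a filesystem that supports hard links:

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Two per-variant library directories.
    variant_a = os.path.join(root, 'python3.2')
    variant_b = os.path.join(root, 'python3.2-debug')
    os.mkdir(variant_a)
    os.mkdir(variant_b)

    source = os.path.join(variant_a, 'spam.py')
    linked = os.path.join(variant_b, 'spam.py')

    with open(source, 'w') as f:
        f.write('VALUE = 1\n')
    os.link(source, linked)  # second name for the same file

    # Rewrite the file in place through the first name ...
    with open(source, 'w') as f:
        f.write('VALUE = 2\n')

    # ... and the change is visible through the second.
    with open(linked) as f:
        seen = f.read()
    same_file = os.path.samefile(source, linked)
```

Note that this only holds for in-place writes; a tool that replaces the file with a new one breaks the link.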

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 25 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                23 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From ncoghlan at gmail.com  Fri Jun 25 00:35:07 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 Jun 2010 08:35:07 +1000
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624115048.4fd152e3@heresy>
References: <20100624115048.4fd152e3@heresy>
Message-ID: <AANLkTikrQiXeGtStA7BjHeda7m0XFwdbUHapzReA4UuD@mail.gmail.com>

On Fri, Jun 25, 2010 at 1:50 AM, Barry Warsaw <barry at python.org> wrote:
> Please let me know what you think.  I'm happy to just commit this to the py3k
> branch if there are no objections <wink>.  I don't think a new PEP is in
> order, but an update to PEP 3147 might make sense.

I like the idea, but I think summarising the rest of this discussion
in its own (relatively short) PEP would be good (there are a few
things that are tricky - exact versioning scheme, PEP 384 forward
compatibility, impact on distutils, articulating the benefits for
distro packaging, etc).

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at thorne.id.au  Fri Jun 25 01:28:21 2010
From: stephen at thorne.id.au (Stephen Thorne)
Date: Fri, 25 Jun 2010 09:28:21 +1000
Subject: [Python-Dev] "2 or 3" link on python.org
Message-ID: <20100624232821.GB10805@thorne.id.au>

Steve Holden Wrote:
> Given the amount of interest this thread has generated I can't help
> wondering why it isn't more prominent in python.org content. Is the
> developer community completely disjoint with the web content editor
> community?
> 
> If there is such a disconnect we should think about remedying it: a
> large "Python 2 or 3?" button could link to a reasoned discussion of the
> pros and cons as evinced in this thread. That way people will end up
> with the right version more often (and be writing Python 2 that will
> more easily migrate to Python 3, if they cannot yet use 3).
> 
> There seems to be a perception that the PSF can help fund developments,
> and indeed Jesse Noller has made a small start with his sprint funding
> proposal (which now has some funding behind it). I think if it is to do
> so the Foundation will have to look for substantial new funding. I do
> not currently understand where this funding would come from, and would
> like to tap your developer creativity in helping to define how the
> Foundation can effectively commit more developer time to Python.
> 
> GSoC and GHOP are great examples, but there is plenty of room for all
> sorts of initiatives that result in development opportunities. I'd like
> to help.

I am extremely keen for this to happen. Does anyone have ownership of this
project? There was some discussion of it up-list but the discussion fizzled.

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

From martin at v.loewis.de  Fri Jun 25 02:00:45 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 02:00:45 +0200
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <20100624232821.GB10805@thorne.id.au>
References: <20100624232821.GB10805@thorne.id.au>
Message-ID: <4C23F1AD.9040809@v.loewis.de>

Am 25.06.2010 01:28, schrieb Stephen Thorne:
> Steve Holden Wrote:
>> Given the amount of interest this thread has generated I can't help
>> wondering why it isn't more prominent in python.org content. Is the
>> developer community completely disjoint with the web content editor
>> community?
>>
>> If there is such a disconnect we should think about remedying it: a
>> large "Python 2 or 3?" button could link to a reasoned discussion of the
>> pros and cons as evinced in this thread. That way people will end up
>> with the right version more often (and be writing Python 2 that will
>> more easily migrate to Python 3, if they cannot yet use 3).
>>
>> There seems to be a perception that the PSF can help fund developments,
>> and indeed Jesse Noller has made a small start with his sprint funding
>> proposal (which now has some funding behind it). I think if it is to do
>> so the Foundation will have to look for substantial new funding. I do
>> not currently understand where this funding would come from, and would
>> like to tap your developer creativity in helping to define how the
>> Foundation can effectively commit more developer time to Python.
>>
>> GSoC and GHOP are great examples, but there is plenty of room for all
>> sorts of initiatives that result in development opportunities. I'd like
>> to help.
> 
> I am extremely keen for this to happen. Does anyone have ownership of this
> project? There was some discussion of it up-list but the discussion fizzled.

Can you please explain what "this project" is, in the context of your
message? GSoC? GHOP?

Regards,
Martin

From foom at fuhm.net  Fri Jun 25 02:23:51 2010
From: foom at fuhm.net (James Y Knight)
Date: Thu, 24 Jun 2010 20:23:51 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23D3C2.1060500@scottdial.com>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
Message-ID: <D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>


On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:

> On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>>> What use case does this address?
>>
>> Specifically, it's the use case where we (Debian/Ubuntu) plan on installing
>> all Python 3.x packages into /usr/lib/python3/dist-packages.  As of PEP 3147,
>> we can do that without collisions on the pyc files, but would still have to
>> symlink for extension module .so files, because they are always named foo.so
>> and Python 3.2's foo.so won't (modulo PEP 384) be compatible with Python 3.3's
>> foo.so.
>
> If the package has .so files that aren't compatible with other version
> of python, then what is the motivation for placing that in a shared
> location (since it can't actually be shared)

Because python looks for .so files in the same place it looks for  
the .py files of the same package. E.g., given a module like lxml, it  
contains the following files (among others):
lxml/
lxml/__init__.py
lxml/__init__.pyc
lxml/builder.py
lxml/builder.pyc
lxml/etree.so

And you can only put it in one place. Really, python should store
the .py files in /usr/share/python/, the .so files in
/usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc files in
/var/lib/python2.5-debug. But python doesn't work like that.

James

From stephen at thorne.id.au  Fri Jun 25 02:31:49 2010
From: stephen at thorne.id.au (Stephen Thorne)
Date: Fri, 25 Jun 2010 10:31:49 +1000
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <4C23F1AD.9040809@v.loewis.de>
References: <20100624232821.GB10805@thorne.id.au>
	<4C23F1AD.9040809@v.loewis.de>
Message-ID: <20100625003149.GA16084@thorne.id.au>

On 2010-06-25, "Martin v. Löwis" wrote:
> Am 25.06.2010 01:28, schrieb Stephen Thorne:
> > Steve Holden Wrote:
> >> Given the amount of interest this thread has generated I can't help
> >> wondering why it isn't more prominent in python.org content. Is the
> >> developer community completely disjoint with the web content editor
> >> community?
> >>
> >> If there is such a disconnect we should think about remedying it: a
> >> large "Python 2 or 3?" button could link to a reasoned discussion of the
> >> pros and cons as evinced in this thread. That way people will end up
> >> with the right version more often (and be writing Python 2 that will
> >> more easily migrate to Python 3, if they cannot yet use 3).
> >>
> >> There seems to be a perception that the PSF can help fund developments,
> >> and indeed Jesse Noller has made a small start with his sprint funding
> >> proposal (which now has some funding behind it). I think if it is to do
> >> so the Foundation will have to look for substantial new funding. I do
> >> not currently understand where this funding would come from, and would
> >> like to tap your developer creativity in helping to define how the
> >> Foundation can effectively commit more developer time to Python.
> >>
> >> GSoC and GHOP are great examples, but there is plenty of room for all
> >> sorts of initiatives that result in development opportunities. I'd like
> >> to help.
> > 
> > I am extremely keen for this to happen. Does anyone have ownership of this
> > project? There was some discussion of it up-list but the discussion fizzled.
> 
> Can you please explain what "this project" is, in the context of your
> message? GSoC? GHOP?

Oh, I thought this was quite clear. I was specifically meaning the large
"Python 2 or 3" button on python.org. It would help users who want to know
what version of python to use if they had a clear guide as to what version
to download.

It doesn't help if someone goes to do greenfield development in python
if a library they depend upon has yet to be ported, and they're trying to
use python 3.

(As an addendum add pygtk to the list of libs that python 3 users on #python
are alarmed to find haven't been ported yet)

-- 
Regards,
Stephen Thorne
Development Engineer
Netbox Blue

From healey.rich at gmail.com  Fri Jun 25 02:51:18 2010
From: healey.rich at gmail.com (Rich Healey)
Date: Fri, 25 Jun 2010 10:51:18 +1000
Subject: [Python-Dev] docs - Copy
Message-ID: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>

http://docs.python.org/library/copy.html

Just near the bottom it reads:

"""Shallow copies of dictionaries can be made using dict.copy(), and
of lists by assigning a slice of the entire list, for example,
copied_list = original_list[:]."""


Surely this is a typo? To my understanding, copied_list =
original_list[:] gives you a clean copy (slicing returns a new
object....)

Can this be updated? Or someone explain to me why it's correct?

Cheers

Example:


>>> t = [1, 2, 3]
>>> y = t
>>> u = t[:]
>>> y[1] = "rawr"
>>> t
[1, 'rawr', 3]
>>> u
[1, 2, 3]
>>>
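
For what it's worth, the session above and the quoted sentence are consistent: a slice is a new list, but a shallow one. A small sketch with nested lists shows where it differs from copy.deepcopy:

```python
import copy

original = [[1, 2], [3, 4]]
shallow = original[:]            # new outer list, same inner lists
deep = copy.deepcopy(original)   # new outer list and new inner lists

original.append([5, 6])          # only affects the original outer list
original[0].append('rawr')       # mutates an inner list the shallow copy shares

shares_inner = shallow[0] is original[0]
```

With flat lists of immutable items, as in the session above, the two kinds of copy behave the same.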

From ben+python at benfinney.id.au  Fri Jun 25 02:54:30 2010
From: ben+python at benfinney.id.au (Ben Finney)
Date: Fri, 25 Jun 2010 10:54:30 +1000
Subject: [Python-Dev] FHS compliance of Python installation (was: versioned
	.so files for Python 3.2)
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
Message-ID: <876318lynt.fsf_-_@benfinney.id.au>

James Y Knight <foom at fuhm.net> writes:

> Really, python should store the .py files in /usr/share/python/, the
> .so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
> files in /var/lib/python2.5-debug. But python doesn't work like that.

+1

So who's going to draft the “Filesystem Hierarchy Standard compliance”
PEP? :-)

-- 
 \     “Having sex with Rachel is like going to a concert. She yells a |
  `\      lot, and throws frisbees around the room; and when she wants |
_o__)                       more, she lights a match.” --Steven Wright |
Ben Finney


From steve at holdenweb.com  Fri Jun 25 02:58:41 2010
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 24 Jun 2010 20:58:41 -0400
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <20100625003149.GA16084@thorne.id.au>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
Message-ID: <4C23FF41.5020006@holdenweb.com>

Stephen Thorne wrote:
> On 2010-06-25, "Martin v. Löwis" wrote:
>> Am 25.06.2010 01:28, schrieb Stephen Thorne:
>>> Steve Holden Wrote:
>>>> Given the amount of interest this thread has generated I can't help
>>>> wondering why it isn't more prominent in python.org content. Is the
>>>> developer community completely disjoint with the web content editor
>>>> community?
>>>>
>>>> If there is such a disconnect we should think about remedying it: a
>>>> large "Python 2 or 3?" button could link to a reasoned discussion of the
>>>> pros and cons as evinced in this thread. That way people will end up
>>>> with the right version more often (and be writing Python 2 that will
>>>> more easily migrate to Python 3, if they cannot yet use 3).
>>>>
>>>> There seems to be a perception that the PSF can help fund developments,
>>>> and indeed Jesse Noller has made a small start with his sprint funding
>>>> proposal (which now has some funding behind it). I think if it is to do
>>>> so the Foundation will have to look for substantial new funding. I do
>>>> not currently understand where this funding would come from, and would
>>>> like to tap your developer creativity in helping to define how the
>>>> Foundation can effectively commit more developer time to Python.
>>>>
>>>> GSoC and GHOP are great examples, but there is plenty of room for all
>>>> sorts of initiatives that result in development opportunities. I'd like
>>>> to help.
>>> I am extremely keen for this to happen. Does anyone have ownership of this
>>> project? There was some discussion of it up-list but the discussion fizzled.
>> Can you please explain what "this project" is, in the context of your
>> message? GSoC? GHOP?
> 
> Oh, I thought this was quite clear. I was specifically meaning the large
> "Python 2 or 3" button on python.org. It would help users who want to know
> what version of python to use if they had a clear guide as to what version
> to download.
> 
> It doesn't help if someone goes to do greenfield development in python
> if a library they depend upon has yet to be ported, and they're trying to
> use python 3.
> 
> (As an addendum add pygtk to the list of libs that python 3 users on #python
> are alarmed to find haven't been ported yet)
> 
This topic really needs to go to the pydotorg list, as the guys there
maintain the site content. I know that Michael Foord is on both lists,
so he may be a good candidate for leading the charge, so to speak. This
topic is likely to assume increasing importance.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000

From steve at holdenweb.com  Fri Jun 25 02:58:41 2010
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 24 Jun 2010 20:58:41 -0400
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <20100625003149.GA16084@thorne.id.au>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
Message-ID: <4C23FF41.5020006@holdenweb.com>

Stephen Thorne wrote:
> On 2010-06-25, "Martin v. Löwis" wrote:
>> Am 25.06.2010 01:28, schrieb Stephen Thorne:
>>> Steve Holden Wrote:
>>>> Given the amount of interest this thread has generated I can't help
>>>> wondering why it isn't more prominent in python.org content. Is the
>>>> developer community completely disjoint with the web content editor
>>>> community?
>>>>
>>>> If there is such a disconnect we should think about remedying it: a
>>>> large "Python 2 or 3?" button could link to a reasoned discussion of the
>>>> pros and cons as evinced in this thread. That way people will end up
>>>> with the right version more often (and be writing Python 2 that will
>>>> more easily migrate to Python 3, if they cannot yet use 3).
>>>>
>>>> There seems to be a perception that the PSF can help fund developments,
>>>> and indeed Jesse Noller has made a small start with his sprint funding
>>>> proposal (which now has some funding behind it). I think if it is to do
>>>> so the Foundation will have to look for substantial new funding. I do
>>>> not currently understand where this funding would come from, and would
>>>> like to tap your developer creativity in helping to define how the
>>>> Foundation can effectively commit more developer time to Python.
>>>>
>>>> GSoC and GHOP are great examples, but there is plenty of room for all
>>>> sorts of initiatives that result in development opportunities. I'd like
>>>> to help.
>>> I am extremely keen for this to happen. Does anyone have ownership of this
>>> project? There was some discussion of it up-list but the discussion fizzled.
>> Can you please explain what "this project" is, in the context of your
>> message? GSoC? GHOP?
> 
> Oh, I thought this was quite clear. I was specifically meaning the large
> "Python 2 or 3" button on python.org. It would help users who want to know
> what version of python to use if they had a clear guide as to what version
> to download.
> 
> It doesn't help if someone goes to do greenfield development in python
> if a library they depend upon has yet to be ported, and they're trying to
> use python 3.
> 
> (As an addendum add pygtk to the list of libs that python 3 users on #python
> are alarmed to find haven't been ported yet)
> 
This topic really needs to go to the pydotorg list, as the guys there
maintain the site content. I know that Michael Foord is on both lists,
so he may be a good candidate for leading the charge, so to speak. This
topic is likely to assume increasing importance.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From steve at holdenweb.com  Fri Jun 25 03:04:03 2010
From: steve at holdenweb.com (Steve Holden)
Date: Thu, 24 Jun 2010 21:04:03 -0400
Subject: [Python-Dev] docs - Copy
In-Reply-To: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
Message-ID: <i00vb1$fsp$1@dough.gmane.org>

Rich Healey wrote:
> http://docs.python.org/library/copy.html
> 
> Just near the bottom it reads:
> 
> """Shallow copies of dictionaries can be made using dict.copy(), and
> of lists by assigning a slice of the entire list, for example,
> copied_list = original_list[:]."""
> 
> 
> Surely this is a typo? To my understanding, copied_list =
> original_list[:] gives you a clean copy (slicing returns a new
> object....)
> 
Yes, but it's a shallow copy: the new object references exactly the same
objects as the original list (not copies of those objects). A deep copy
would need to copy any referenced lists, and so on.
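
For illustration, the distinction shows up as soon as the list contains
mutable objects. This is a minimal sketch using only the standard copy
module; the variable names are made up for the example.

```python
import copy

original = [[1, 2], [3, 4]]
shallow = original[:]             # new outer list, same inner lists
deep = copy.deepcopy(original)    # new outer list and new inner lists

original[0].append("rawr")        # mutate an inner list in place

print(shallow is original)        # False: slicing built a new list object
print(shallow[0] is original[0])  # True: its items are the same objects
print(shallow[0])                 # [1, 2, 'rawr']
print(deep[0])                    # [1, 2]
```

With a flat list of integers the two kinds of copy are indistinguishable,
which is probably why the slice looks like a "clean" copy at first glance.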

> Can this be updated? Or someone explain to me why it's correct?
> 
It sounds correct to me.

regards
 Steve


> Cheers
> 
> Example:
> 
> 
>>>> t = [1, 2, 3]
>>>> y = t
>>>> u = t[:]
>>>> y[1] = "rawr"
>>>> t
> [1, 'rawr', 3]
>>>> u
> [1, 2, 3]


-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From alexander.belopolsky at gmail.com  Fri Jun 25 03:05:09 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Thu, 24 Jun 2010 21:05:09 -0400
Subject: [Python-Dev] docs - Copy
In-Reply-To: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
Message-ID: <AANLkTim5q0SF1tmpCeYUZsFUXC0eQFQC9L8Ok7QbWF8J@mail.gmail.com>

On Thu, Jun 24, 2010 at 8:51 PM, Rich Healey <healey.rich at gmail.com> wrote:
> http://docs.python.org/library/copy.html
>
> Just near the bottom it reads:
>
> """Shallow copies of dictionaries can be made using dict.copy(), and
> of lists by assigning a slice of the entire list, for example,
> copied_list = original_list[:]."""
>
>
> Surely this is a typo? To my understanding, copied_list =
> original_list[:] gives you a clean copy (slicing returns a new
> object....)
>

If you read the doc excerpt carefully, you will realize that it says
the same thing.  I agree that the language can be improved, though.
There is no need to bring in assignment to explain that a[:] makes a
copy of list a.   Please create a documentation issue at
http://bugs.python.org .  If you can suggest a better formulation, it
is likely to be accepted.

From greg.ewing at canterbury.ac.nz  Fri Jun 25 03:18:18 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 25 Jun 2010 13:18:18 +1200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23D3C2.1060500@scottdial.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
Message-ID: <4C2403DA.5000907@canterbury.ac.nz>

Scott Dial wrote:

> But the only motivation for doing this with .pyc files is that the .py
> files are able to be shared,

In an application made up of a mixture of pure Python and
extension modules, the .py files are able to be shared too.
Seems to me that a similar motivation exists here as well.
Not exactly the same, but closely related.

-- 
Greg

From healey.rich at gmail.com  Fri Jun 25 03:14:39 2010
From: healey.rich at gmail.com (Rich Healey)
Date: Fri, 25 Jun 2010 11:14:39 +1000
Subject: [Python-Dev] docs - Copy
In-Reply-To: <i00vb1$fsp$1@dough.gmane.org>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
	<i00vb1$fsp$1@dough.gmane.org>
Message-ID: <AANLkTikBCSm12hMFoPoZ7e6X1FShtZDQtwpOSW9eCD7-@mail.gmail.com>

On Fri, Jun 25, 2010 at 11:04 AM, Steve Holden <steve at holdenweb.com> wrote:
> Rich Healey wrote:
>> http://docs.python.org/library/copy.html
>>
>> Just near the bottom it reads:
>>
>> """Shallow copies of dictionaries can be made using dict.copy(), and
>> of lists by assigning a slice of the entire list, for example,
>> copied_list = original_list[:]."""
>>
>>
>> Surely this is a typo? To my understanding, copied_list =
>> original_list[:] gives you a clean copy (slicing returns a new
>> object....)
>>
> Yes, but it's a shallow copy: the new object references exactly the same
> objects as the original list (not copies of those objects). A deep copy
> would need to copy any referenced lists, and so on.
>

My apologies guys, I see now.

I will see if I can think of a less ambiguous way to word this and submit a bug.

Thank you!

From tjreedy at udel.edu  Fri Jun 25 03:18:13 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Thu, 24 Jun 2010 21:18:13 -0400
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <20100625003149.GA16084@thorne.id.au>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
Message-ID: <i0104m$hrr$1@dough.gmane.org>

On 6/24/2010 8:31 PM, Stephen Thorne wrote:

> Oh, I thought this was quite clear. I was specifically meaning the large
> "Python 2 or 3" button on python.org. It would help users who want to know
> what version of python to use if they had a clear guide as to what version
> to download.

I think everyone on pydev agrees that that would be good, but I do not
believe anyone has taken ownership of the issue as yet. I am not sure
who currently maintains the site and whether they are aware of the proposal.

I believe there is material on the wiki as well as the two existing 
pages on other sites that were discussed here. So a new page on 
python.org could consist of a few links. Someone just has to write it.
>
> It doesn't help if someone goes to do greenfield development in python
> if a library they depend upon has yet to be ported, and they're trying to
> use python 3.
>
> (As an addendum add pygtk to the list of libs that python 3 users on #python
> are alarmed to find haven't been ported yet)

The list, if it exists, should be on the wiki, where any registered user 
can edit it, rather than on the .org page.

I suspect that the feedback about Python on #python is somewhat 
different from that on python-list. I also suspect that some of it could 
be used to improve python, the docs, and the site. Is that happening 
much? I know I regularly open tracker issues (such as 6507, 8824, and 
8945) based on python-list discussions, and I know others have made
wiki edits.


-- 
Terry Jan Reedy


From greg.ewing at canterbury.ac.nz  Fri Jun 25 03:28:14 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Fri, 25 Jun 2010 13:28:14 +1200
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <i00khu$fvg$1@dough.gmane.org>
References: <11597.1277401099@parc.com> <i00khu$fvg$1@dough.gmane.org>
Message-ID: <4C24062E.7040105@canterbury.ac.nz>

Terry Reedy wrote:
> On 6/24/2010 1:38 PM, Bill Janssen wrote:
> 
>> We have separate types for int,
>> float, Decimal, etc.  But they're all numbers, and they all
>> cross-operate.
> 
> No they do not. Decimal only mixes properly with ints, but not with 
> anything else

I think there are also some important differences between
numbers and strings concerning how they interact with C code.

In C there are really only two choices for representing a
Python number in a way that C code can directly operate on --
long or double -- and there is a set of functions for coercing a
Python object into one of these that C code almost universally
uses. So a new number type only has to implement the appropriate
conversion methods to be usable by all of that C code.
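
The same idea is visible from the Python side. The sketch below assumes
CPython, where math.sqrt and sequence indexing are implemented in C and
obtain a C-level number through the object's conversion methods; the two
classes are hypothetical and exist only for the example.

```python
import math


class Half:
    """Hypothetical number type that only knows how to become a float."""

    def __float__(self):
        return 0.5


class Three:
    """Hypothetical integer-like type that exposes itself via __index__."""

    def __index__(self):
        return 3


root = math.sqrt(Half())    # the C code gets a double through __float__
item = "abcdef"[Three()]    # the C code gets an integer through __index__
bits = bin(Three())         # same conversion, different C function

print(root, item, bits)
```

Neither class inherits from a numeric type, yet existing C-implemented
functions accept both without modification.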

On the other hand, the existing C code that operates on Python
strings often assumes that it has a particular internal
representation. A new abstract string-access API would have to
be devised, and all existing C code updated to use it. Also,
this new API would not be as easy to use as the number API,
because it would involve asking for the data in some specified
encoding, which would require memory allocation and management.

-- 
Greg

From ncoghlan at gmail.com  Fri Jun 25 05:34:33 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Fri, 25 Jun 2010 13:34:33 +1000
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <i0104m$hrr$1@dough.gmane.org>
References: <20100624232821.GB10805@thorne.id.au>
	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
	<i0104m$hrr$1@dough.gmane.org>
Message-ID: <AANLkTimD90GqAagu2ckO_9jyM9uVrTzUAWmmgH5Mjwcs@mail.gmail.com>

On Fri, Jun 25, 2010 at 11:18 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> I believe there is material on the wiki as well as the two existing pages on
> other sites that were discussed here. So a new page on python.org could
> consist of a few links. Someone just has to write it.

There's material on the wiki *now* (the Python2orPython3 page), but
there wasn't before the recent discussion started. The whole
Beginner's Guide on the wiki could actually use some TLC to bring it
up to speed with the existence of Python 3.x.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From orsenthil at gmail.com  Fri Jun 25 06:54:07 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Fri, 25 Jun 2010 10:24:07 +0530
Subject: [Python-Dev] docs - Copy
In-Reply-To: <AANLkTim5q0SF1tmpCeYUZsFUXC0eQFQC9L8Ok7QbWF8J@mail.gmail.com>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
	<AANLkTim5q0SF1tmpCeYUZsFUXC0eQFQC9L8Ok7QbWF8J@mail.gmail.com>
Message-ID: <20100625045407.GA3191@remy>

On Thu, Jun 24, 2010 at 09:05:09PM -0400, Alexander Belopolsky wrote:
> On Thu, Jun 24, 2010 at 8:51 PM, Rich Healey <healey.rich at gmail.com> wrote:
> > http://docs.python.org/library/copy.html
> >
> > Just near the bottom it reads:
> >
> > """Shallow copies of dictionaries can be made using dict.copy(), and
> > of lists by assigning a slice of the entire list, for example,
> > copied_list = original_list[:]."""
> >
> >
> > Surely this is a typo? To my understanding, copied_list =
> > original_list[:] gives you a clean copy (slicing returns a new
> > object....)
> >
> 
> the same thing.  I agree that the language can be improved, though.
> There is no need to bring in assignment to explain that a[:] makes a
> copy of list a.   Please create a documentation issue at

Better still, add your doc change suggestion (possible explanation) to
this issue:
http://bugs.python.org/issue9021


-- 
Senthil

From stephen at xemacs.org  Fri Jun 25 09:05:43 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 Jun 2010 16:05:43 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>
Message-ID: <878w63oam0.fsf@uwakimon.sk.tsukuba.ac.jp>

Guido van Rossum writes:
 > On Thu, Jun 24, 2010 at 1:12 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

 > Understood, but both the majority of str/bytes methods and several
 > existing APIs (e.g. many in the os module, like os.listdir()) do it
 > this way.

Understood.

 > Also, IMO a polymorphic function should *not* accept *mixed*
 > bytes/text input -- join('x', b'y') should be rejected.

Agreed.

 > But join('x', 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make
 > sense to me.
 > 
 > So, actually, I *don't* understand what you mean by needing LBYL.

Consider docutils.  Some folks assert that URIs *are* bytes and should
be manipulated as such.  So base URIs should be bytes.  But there are
various ways to refer to a base URI and combine it with relative URI
taken from literal text in reST.  That literal text will be
represented as str.  So you want to use urljoin, but this usage isn't
polymorphic.

If you forget to do a conversion here, urljoin will raise, of course.
But late conversion may not be appropriate.  AIUI Philip at least
wants ways to raise exceptions earlier than that on some code paths.
That's LBYL, no?
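
To make the two positions concrete, here is a minimal sketch. The join()
and resolve() functions are hypothetical stand-ins, not the stdlib
urljoin, and the ASCII assumption in resolve() is made purely for the
example.

```python
def join(a, b):
    """Hypothetical polymorphic join: str with str, or bytes with bytes."""
    if isinstance(a, str) and isinstance(b, str):
        return a + "/" + b
    if isinstance(a, bytes) and isinstance(b, bytes):
        return a + b"/" + b
    # Mixed input is rejected rather than silently coerced.
    raise TypeError("cannot mix str and bytes arguments")


def resolve(base, relative):
    """LBYL variant: convert at the boundary so join() never sees a mix.

    Assumes, for the sake of the example, that the text is ASCII-only.
    """
    if isinstance(base, bytes) and isinstance(relative, str):
        relative = relative.encode("ascii")  # fails early on non-ASCII text
    return join(base, relative)


print(join("x", "y"))                            # x/y
print(join(b"x", b"y"))                          # b'x/y'
print(resolve(b"http://example.org", "a.html"))  # b'http://example.org/a.html'
```

In this sketch join() alone only reports the mismatch at the point of
use, while resolve() checks and converts where the two kinds of data
first meet.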

From stephen at xemacs.org  Fri Jun 25 09:49:16 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 Jun 2010 16:49:16 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100624170856.0853D3A4099@sparrow.telecommunity.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100624170856.0853D3A4099@sparrow.telecommunity.com>
Message-ID: <877hlno8lf.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:

 > This doesn't have to be in the functions; it can be in the 
 > *types*.  Mixed-type string operations have to do type checking and 
 > upcasting already, but if the protocol were open, you could make an 
 > encoded-bytes type that would handle the error checking.

Don't you realize that "encoded-bytes" is equivalent to use of a very
limited profile of ISO 2022 coding extensions?  Such as Emacs/MULE
internal encoding or TRON code?  It has been tried.  It does not work.

I understand how types can do such checking; my point is that the
encoded-bytes type doesn't have enough information to do it in the
cases where you think it is better than converting to str.  There are
*no useful operations* that can be done on two encoded-bytes with
different encodings unless you know the ultimate target codec.  The
only sensible way to define the concatenation of ('ascii', 'English')
with ('euc-jp','??????') is something like ('ascii', 'English',
'euc-jp','??????'), and *not* ('euc-jp','English??????'), because you
don't know that the ultimate target codec is 'euc-jp'-compatible.
Worse, you need to build in all the information about which codecs are
mutually compatible into the encoded-bytes type.  For example, if the
ultimate target is known to be 'shift_jis', it's trivially compatible
with 'ascii' and 'euc-jp' requires a conversion, but latin-9 you can't
have.
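
A small sketch of why the target codec matters, using the standard
codecs. The Japanese sample text is chosen for this example (it is not
the text from the message above) and is written with escapes so the
snippet survives any mail encoding.

```python
english = "English".encode("ascii")
sample = "\u65e5\u672c\u8a9e"             # three kanji, sample text only
japanese = sample.encode("euc-jp")

mixed = english + japanese                # plain byte concatenation

# Correct only if the ultimate target is EUC-JP compatible.
as_euc = mixed.decode("euc-jp")

# If the target is Shift JIS, the same bytes are wrong: depending on the
# byte values this either raises or yields different characters.
try:
    as_sjis = mixed.decode("shift_jis")
except UnicodeDecodeError:
    as_sjis = None

# The EUC-JP part has to be converted; the ASCII part can be kept as is.
converted = english + japanese.decode("euc-jp").encode("shift_jis")

# A Latin-9 character such as the euro sign cannot be represented at all.
try:
    "\u20ac".encode("shift_jis")
    euro_fits = True
except UnicodeEncodeError:
    euro_fits = False

print(as_euc == "English" + sample, as_sjis == "English" + sample, euro_fits)
```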

 > (Btw, in some earlier emails, Stephen, you implied that this could be 
 > fixed with codecs -- but it can't, because the problem isn't with the 
 > bytes containing invalid Unicode, it's with the Unicode containing 
 > invalid bytes -- i.e., characters that can't be encoded to the 
 > ultimate codec target.)

No, the problem is not with the Unicode, it is with the code that
allows characters not encodable with the target codec.  If you don't
have a target codec, there are ascii-safe source codecs, such as
'latin-1' or 'ascii' with surrogateescape, that will work any time
that bytes-oriented processing can work.
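
For illustration, both approaches round-trip arbitrary bytes. This is a
minimal sketch; the sample byte string is made up, and the
surrogateescape error handler requires Python 3.1 or later.

```python
raw = b"/caf\xe9/\xff.txt"        # neither valid ASCII nor valid UTF-8

via_latin1 = raw.decode("latin-1")
via_escape = raw.decode("ascii", "surrogateescape")

# Text-style processing works on either form.
parts_latin1 = via_latin1.split("/")
parts_escape = via_escape.split("/")

# Encoding back with the same codec restores the original bytes exactly.
assert via_latin1.encode("latin-1") == raw
assert via_escape.encode("ascii", "surrogateescape") == raw
```

The two str values differ (latin-1 maps each high byte to a real
character, surrogateescape maps it to a lone surrogate), but either one
gets the original bytes back out unchanged.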

From scott+python-dev at scottdial.com  Fri Jun 25 10:53:21 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Fri, 25 Jun 2010 04:53:21 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
Message-ID: <4C246E81.3020302@scottdial.com>

On 6/24/2010 8:23 PM, James Y Knight wrote:
> On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:
>> If the package has .so files that aren't compatible with other versions
>> of python, then what is the motivation for placing that in a shared
>> location (since it can't actually be shared)
> 
> Because python looks for .so files in the same place it looks for the
> .py files of the same package.

My suggestion was that a package that contains .so files should not be
shared (e.g., the entire lxml package should be placed in a
version-specific path). The motivation for this PEP was to simplify the
installation of python packages for distros; it was not to reduce the
number of .py files on the disk.

Placing .so files together does not simplify that install process in any
way. You will still have to handle such packages in a special way. You
must still compile the package multiple times for each relevant version
of python (with special tagging that I imagine distutils can take care
of) and, worse yet, you have created a trickier install than merely
having multiple search paths (e.g., installing/uninstalling lxml for
*one* version of python is actually more difficult in this scheme).

Either the motivation for this PEP is inaccurate or I am failing to
understand how this is *simpler*. In the case of pure-python, this PEP
is clearly a win, but I have not seen an argument that it is a win for
.so files. Moreover, the PEP itself is titled "PYC Repository
Directories" (not "shared site-packages") and makes no mention of .so
files at all.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From scott+python-dev at scottdial.com  Fri Jun 25 11:02:24 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Fri, 25 Jun 2010 05:02:24 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C2403DA.5000907@canterbury.ac.nz>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>
	<4C2403DA.5000907@canterbury.ac.nz>
Message-ID: <4C2470A0.4000802@scottdial.com>

On 6/24/2010 9:18 PM, Greg Ewing wrote:
> Scott Dial wrote:
> 
>> But the only motivation for doing this with .pyc files is that the .py
>> files are able to be shared,
> 
> In an application made up of a mixture of pure Python and
> extension modules, the .py files are able to be shared too.
> Seems to me that a similar motivation exists here as well.
> Not exactly the same, but closely related.
> 

If I recall Barry's motivation correctly, the PEP was intended to
simplify the installation of packages for multiple versions of Python,
although the PEP states that in a less direct way. In the case of
pure-python packages, this is merely about avoiding .pyc collisions.
But, in the case of packages with .so files, I fail to see how this is
simpler (in fact, I believe it to be more complicated). So, I am not
sure the PEP supports this feature being proposed (since it makes no
mention of .so files), and more importantly, I am not sure it actually
makes anything better for anyone (still requires multiple compilations
and un/install gymnastics).

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From lvh at laurensvh.be  Fri Jun 25 11:18:18 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Fri, 25 Jun 2010 11:18:18 +0200
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <AANLkTimD90GqAagu2ckO_9jyM9uVrTzUAWmmgH5Mjwcs@mail.gmail.com>
References: <20100624232821.GB10805@thorne.id.au>
	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
	<i0104m$hrr$1@dough.gmane.org>
	<AANLkTimD90GqAagu2ckO_9jyM9uVrTzUAWmmgH5Mjwcs@mail.gmail.com>
Message-ID: <AANLkTil_c_jzQW3g5IsDYfaVT7BVL9v7GszUh7YUOHPu@mail.gmail.com>

On Fri, Jun 25, 2010 at 5:34 AM, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Fri, Jun 25, 2010 at 11:18 AM, Terry Reedy <tjreedy at udel.edu> wrote:
>> I believe there is material on the wiki as well as the two existing pages on
>> other sites that were discussed here. So a new page on python.org could
>> consist of a few links. Someone just has to write it.
>
> There's material on the wiki *now* (the Python2orPython3 page), but
> there wasn't before the recent discussion started. The whole
> Beginner's Guide on the wiki could actually use some TLC to bring it
> up to speed with the existence of Python 3.x.
>
> Cheers,
> Nick.
>

+1, this definitely sounds like a good idea to me.

cheers,
Laurens

From stephen at xemacs.org  Fri Jun 25 12:06:33 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Fri, 25 Jun 2010 19:06:33 +0900
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
Message-ID: <876317o28m.fsf@uwakimon.sk.tsukuba.ac.jp>

Ian Bicking writes:

 > We've set up a system where we think of text as natively unicode, with
 > encodings to put that unicode into a byte form.  This is certainly
 > appropriate in a lot of cases.  But there's a significant class of problems
 > where bytes are the native structure.  Network protocols are what we've been
 > discussing, and are a notable case of that.  That is, b'/' is the most
 > native sense of a path separator in a URL, or b':' is the most native sense
 > of what separates a header name from a header value in HTTP.

IMHO, URIs don't have a native language in this sense.  Network
programmers do, however, and it is bytes.  Text-handling programmers
also do, and it is str.

 > So with this idea in mind it makes more sense to me that *specific pieces of
 > text* can be reasonably treated as both bytes and text.  All the string
 > literals in urllib.parse.urlunsplit() for example.
 > 
 > The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it does not
 > become special('/x')) and special('/')+'x'=='/x' (again it becomes str).  This
 > avoids some of the cases of unicode or str infecting a system as they did in
 > Python 2 (where you might pass in unicode and everything works fine until
 > some non-ASCII is introduced).

I think you need to give explicit examples where this actually helps
in terms of "type contagion".  I expect that it doesn't help at all,
especially not for the people whose native language for URIs is bytes.
These specials are still going to flip to unicode as soon as it comes
in, and that will be incompatible with the bytes they'll need later.
So they're still going to need to filter out unicode on input.

It looks like it would be useful for programmers of polymorphic
functions, though.

From pje at telecommunity.com  Fri Jun 25 15:07:46 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 25 Jun 2010 09:07:46 -0400
Subject: [Python-Dev] bytes / unicode
Message-ID: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>

At 04:49 PM 6/25/2010 +0900, Stephen J. Turnbull wrote:
>P.J. Eby writes:
>
>  > This doesn't have to be in the functions; it can be in the
>  > *types*.  Mixed-type string operations have to do type checking and
>  > upcasting already, but if the protocol were open, you could make an
>  > encoded-bytes type that would handle the error checking.
>
>Don't you realize that "encoded-bytes" is equivalent to use of a very
>limited profile of ISO 2022 coding extensions?  Such as Emacs/MULE
>internal encoding or TRON code?  It has been tried.  It does not work.
>
>I understand how types can do such checking; my point is that the
>encoded-bytes type doesn't have enough information to do it in the
>cases where you think it is better than converting to str.  There are
>*no useful operations* that can be done on two encoded-bytes with
>different encodings unless you know the ultimate target codec.

I do know the ultimate target codec -- that's the point.

IOW, I want to be able to do all my operations by passing
target-encoded strings to polymorphic functions.  Then, the moment 
something creeps in that won't go to the target codec, I'll be able 
to track down the hole in the legacy code that's letting bad data creep in.
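
One way such a type could behave is sketched below. TargetText is a
hypothetical str subclass invented for this illustration; it is not an
existing API and not necessarily the design intended above.

```python
class TargetText(str):
    """Hypothetical str subclass tied to one target codec."""

    def __new__(cls, value, codec):
        value.encode(codec)          # raises UnicodeEncodeError right here
        self = super().__new__(cls, value)
        self.codec = codec
        return self

    def __add__(self, other):
        # Re-validate every result so bad data is caught where it enters.
        return TargetText(str(self) + other, self.codec)

    def __radd__(self, other):
        return TargetText(other + str(self), self.codec)


header = TargetText("Subject: ", "ascii")
ok = header + "hello"                # still a TargetText
prefixed = "X-" + header             # plain str on the left also works

try:
    header + "caf\xe9"               # not encodable to the ASCII target
    caught = False
except UnicodeEncodeError:
    caught = True

print(type(ok).__name__, type(prefixed).__name__, caught)
```

The traceback from the failing concatenation points at the code that
introduced the unencodable character, rather than at a later encode()
call on the finished output.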


>   The
>only sensible way to define the concatenation of ('ascii', 'English')
>with ('euc-jp','??????') is something like ('ascii', 'English',
>'euc-jp','??????'), and *not* ('euc-jp','English??????'), because you
>don't know that the ultimate target codec is 'euc-jp'-compatible.
>Worse, you need to build in all the information about which codecs are
>mutually compatible into the encoded-bytes type.  For example, if the
>ultimate target is known to be 'shift_jis', it's trivially compatible
>with 'ascii' and 'euc-jp' requires a conversion, but latin-9 you can't
>have.

The interaction won't be with other encoded bytes, it'll be with 
other *unicode* strings.  Ones coming from other code, and literals 
embedded in the stdlib.


>No, the problem is not with the Unicode, it is with the code that
>allows characters not encodable with the target codec.

And which code that is, precisely, is the thing that may be very 
difficult to find, unless I can identify it at the first point it 
enters (and corrupts) my output data.  When dealing with a large code 
base, this may be a nontrivial problem.


From ianb at colorstudy.com  Fri Jun 25 17:35:44 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 25 Jun 2010 10:35:44 -0500
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <878w63oam0.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com> 
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net> 
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com> 
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com> 
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com> 
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan> 
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan> 
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com> 
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp> 
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com> 
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com> 
	<878w63oam0.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTinZvEpFEyXtZN_dLkUYehpX9HrIiy4CEezzuH6Z@mail.gmail.com>

On Fri, Jun 25, 2010 at 2:05 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

>  > But join('x', 'y') -> 'x/y' and join(b'x', b'y') -> b'x/y' make
>  > sense to me.
>  >
>  > So, actually, I *don't* understand what you mean by needing LBYL.
>
> Consider docutils.  Some folks assert that URIs *are* bytes and should
> be manipulated as such.  So base URIs should be bytes.


I don't get what you are arguing against.  Are you worried that if we make
URL code polymorphic that this will mean some code will treat URLs as bytes,
and that code will be incompatible with URLs as text?  No one is arguing we
remove text support from any of these functions, only that we allow bytes.


-- 
Ian Bicking  |  http://blog.ianbicking.org

From ianb at colorstudy.com  Fri Jun 25 17:40:56 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 25 Jun 2010 10:40:56 -0500
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <876317o28m.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com> 
	<876317o28m.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTimbkqkdxm7_cPYKYVW4SV7gwzfEV-NKSkNv5N0g@mail.gmail.com>

On Fri, Jun 25, 2010 at 5:06 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

>  > So with this idea in mind it makes more sense to me that *specific
>  > pieces of text* can be reasonably treated as both bytes and text.
>  > All the string literals in urllib.parse.urlunsplit() for example.
>  >
>  > The semantics I imagine are that special('/')+b'x'==b'/x' (i.e., it
>  > does not become special('/x')) and special('/')+'x'=='/x' (again it
>  > becomes str).  This avoids some of the cases of unicode or str
>  > infecting a system as they did in Python 2 (where you might pass in
>  > unicode and everything works fine until some non-ASCII is introduced).
>
> I think you need to give explicit examples where this actually helps
> in terms of "type contagion".  I expect that it doesn't help at all,
> especially not for the people whose native language for URIs is bytes.
> These specials are still going to flip to unicode as soon as it comes
> in, and that will be incompatible with the bytes they'll need later.
> So they're still going to need to filter out unicode on input.
>
> It looks like it would be useful for programmers of polymorphic
> functions, though.
>

I'm proposing these specials would be used in polymorphic functions, like
the functions in urllib.parse.  I would not personally use them in my own
code (unless of course I was writing my own polymorphic functions).

This also makes it less important that the objects be a full stand-in for
text: their use should be isolated to specific functions, and they aren't
objects that should be passed around much.  So any unsupported operation
on those text-like objects is easy to identify and quick to detect.  (This
is all a very different use case from bytes+encoding, I think.)
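
A rough sketch of the semantics described in the quoted text, under stated
assumptions: the class name "special" comes from the discussion, but the
ASCII-only restriction, the private helper and the support for only "+" are
illustrative choices, not a proposed implementation.

```python
class special:
    """ASCII-only literal that takes on the type of whatever it is added to."""

    def __init__(self, text):
        # Assumption: only ASCII literals are allowed, so encoding is lossless.
        text.encode("ascii")
        self._text = text

    def _coerce(self, other):
        # Return this literal as bytes or str to match the other operand.
        if isinstance(other, bytes):
            return self._text.encode("ascii")
        if isinstance(other, str):
            return self._text
        return None

    def __add__(self, other):
        mine = self._coerce(other)
        return NotImplemented if mine is None else mine + other

    def __radd__(self, other):
        mine = self._coerce(other)
        return NotImplemented if mine is None else other + mine


print(special("/") + b"x")
print(special("/") + "x")
```

The result is always plain bytes or plain str, never another special, so the
object does not travel beyond the function that created it.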

-- 
Ian Bicking  |  http://blog.ianbicking.org

From status at bugs.python.org  Fri Jun 25 18:08:26 2010
From: status at bugs.python.org (Python tracker)
Date: Fri, 25 Jun 2010 18:08:26 +0200 (CEST)
Subject: [Python-Dev] Summary of Python tracker Issues
Message-ID: <20100625160826.0C34078182@psf.upfronthosting.co.za>


ACTIVITY SUMMARY (2010-06-18 - 2010-06-25)
Python tracker at http://bugs.python.org/

To view or respond to any of the issues listed below, click on the issue 
number.  Do NOT respond to this message.


 2795 open (+38) / 18104 closed (+14) / 20899 total (+52)

Open issues with patches:  1130

Average duration of open issues: 712 days.
Median duration of open issues: 503 days.

Open Issues Breakdown
       open  2765 (+38)
languishing    13 ( +0)
    pending    16 ( +0)

Issues Created Or Reopened (55)
_______________________________

os.path.normcase documentation/behaviour unclear on Mac OS X   2010-06-25
       http://bugs.python.org/issue3485    reopened ezio.melotti                         
       patch                                                                   

uuid.uuid4() generates non-unique values on OSX                2010-06-21
       http://bugs.python.org/issue8621    reopened skrah                                
       patch                                                                   

test_support.run_unittest cmdline options and arguments        2010-06-20
       http://bugs.python.org/issue9028    reopened techtonik                            
                                                                               

errors='replace' works in IDLE, fails at Windows command line. 2010-06-18
       http://bugs.python.org/issue9029    created  jvanpraag                            
                                                                               

ctypes variable limits                                         2010-06-18
       http://bugs.python.org/issue9030    created  kumma                                
                                                                               

distutils uses invalid "-Wstrict-prototypes" flag when compili 2010-06-18
       http://bugs.python.org/issue9031    created  matteo.vescovi                       
                                                                               

xmlrpc: Transport.request() should also catch socket.error(EPI 2010-06-18
       http://bugs.python.org/issue9032    created  haypo                                
       patch                                                                   

cmd module tab misbehavior                                     2010-06-19
       http://bugs.python.org/issue9033    created  slcott                               
                                                                               

datetime module should use int32_t for date/time components    2010-06-20
       http://bugs.python.org/issue9034    created  belopolsky                           
                                                                               

os.path.ismount on windows doesn't support windows mount point 2010-06-20
       http://bugs.python.org/issue9035    created  Oren_Held                            
                                                                               

Simplify Py_CHARMASK                                           2010-06-20
       http://bugs.python.org/issue9036    created  skrah                                
       patch, needs review                                                     

Add explanation as to how to raise a custom exception in the e 2010-06-20
       http://bugs.python.org/issue9037    created  jonathan.underwood                   
       patch                                                                   

test_distutils failure                                         2010-06-20
       http://bugs.python.org/issue9038    created  pitrou                               
                                                                               

IDLE and module Doc                                            2010-06-20
       http://bugs.python.org/issue9039    created  Yoda_Uchiha                          
                                                                               

using MIMEApplication to attach a PDF raises a TypeError excep 2010-06-21
       http://bugs.python.org/issue9040    created  Enrico.Sartori                       
                                                                               

raised exception is misleading                                 2010-06-21
       http://bugs.python.org/issue9041    created  kumma                                
                                                                               

Gettext cache and classes                                      2010-06-21
       http://bugs.python.org/issue9042    created  v_peter                              
       patch                                                                   

2to3 doesn't handle byte comparison well                       2010-06-21
CLOSED http://bugs.python.org/issue9043    created  vdupras                              
                                                                               

[optparse] confusion over an option and its value without any  2010-06-21
       http://bugs.python.org/issue9044    created  kszawala                             
                                                                               

2.7rc1: 64-bit OSX installer is not built with 64-bit tkinter  2010-06-21
       http://bugs.python.org/issue9045    created  srid                                 
                                                                               

Python 2.7rc2 doesn't build on Mac OS X 10.4                   2010-06-21
       http://bugs.python.org/issue9046    created  lemburg                              
                                                                               

Python 2.7rc2 includes -isysroot twice on each gcc command lin 2010-06-21
       http://bugs.python.org/issue9047    created  lemburg                              
                                                                               

no OS X buildbots in the stable list                           2010-06-21
       http://bugs.python.org/issue9048    created  janssen                              
       buildbot                                                                

UnboundLocalError in nested function                           2010-06-21
CLOSED http://bugs.python.org/issue9049    created  Andreas Hofmeister                   
                                                                               

UnboundLocalError in nested function                           2010-06-21
CLOSED http://bugs.python.org/issue9050    created  Andreas Hofmeister                   
                                                                               

Improve pickle format for aware datetime instances             2010-06-21
       http://bugs.python.org/issue9051    created  belopolsky                           
                                                                               

2.7rc2 fails test_urllib_localnet tests on OS X                2010-06-21
CLOSED http://bugs.python.org/issue9052    created  janssen                              
                                                                               

distutils compiles extensions so that Python.h cannot be found 2010-06-21
       http://bugs.python.org/issue9053    created  exarkun                              
                                                                               

pyexpat configured with "--with-system-expat" is incompatible  2010-06-21
       http://bugs.python.org/issue9054    created  dmalcolm                             
       patch                                                                   

test_issue_8959_b fails when run from a service                2010-06-21
       http://bugs.python.org/issue9055    created  pmoore                               
       buildbot                                                                

Adding additional level of bookmarks and section numbers in py 2010-06-22
       http://bugs.python.org/issue9056    created  pengyu.ut                            
                                                                               

Distutils2 needs a home page                                   2010-06-22
       http://bugs.python.org/issue9057    created  dabrahams                            
                                                                               

PyUnicodeDecodeError_Create asserts that various arguments are 2010-06-22
CLOSED http://bugs.python.org/issue9058    created  dmalcolm                             
       patch                                                                   

Backwards compatibility                                        2010-06-23
CLOSED http://bugs.python.org/issue9059    created  Raven                                
                                                                               

Python/dup2.c doesn't compile on (at least) newlib             2010-06-23
       http://bugs.python.org/issue9060    created  torne                                
       patch                                                                   

cgi.escape Can Lead To XSS Vulnerabilities                     2010-06-23
       http://bugs.python.org/issue9061    created  Craig.Younkins                       
                                                                               

urllib.urlopen crashes when launched from a thread             2010-06-23
CLOSED http://bugs.python.org/issue9062    created  olivier-berten                       
                                                                               

TZ examples in datetime.rst are incorrect                      2010-06-23
       http://bugs.python.org/issue9063    created  belopolsky                           
                                                                               

pdb enhancement up/down traversals                             2010-06-23
       http://bugs.python.org/issue9064    created  vandyswa                             
       patch                                                                   

tarfile:  default root:root ownership is incorrect.            2010-06-23
       http://bugs.python.org/issue9065    created  jsbronder                            
       patch                                                                   

Standard type codes for array.array, same as struct            2010-06-24
       http://bugs.python.org/issue9066    created  cmcqueen1975                         
                                                                               

Use macros from pyctype.h                                      2010-06-24
       http://bugs.python.org/issue9067    created  skrah                                
                                                                               

"from . import *"                                              2010-06-24
CLOSED http://bugs.python.org/issue9068    created  bhy                                  
                                                                               

test_float failure on Solaris                                  2010-06-24
       http://bugs.python.org/issue9069    created  mark.dickinson                       
                                                                               

Timestamps are rounded differently in py3k and trunk           2010-06-24
CLOSED http://bugs.python.org/issue9070    created  belopolsky                           
                                                                               

TarFile doesn't support member files with a leading "./"       2010-06-24
CLOSED http://bugs.python.org/issue9071    created  free.ekanayaka                       
                                                                               

Unloading modules - memleaks?                                  2010-06-24
CLOSED http://bugs.python.org/issue9072    created  yappie                               
                                                                               

Tkinter module missing from install on OS X 10.6.4             2010-06-24
       http://bugs.python.org/issue9073    created  RolandJ                              
                                                                               

[includes patch] subprocess module closes standard file descri 2010-06-24
       http://bugs.python.org/issue9074    created  kr                                   
       patch                                                                   

ssl module sets "debug" flag on SSL struct                     2010-06-24
CLOSED http://bugs.python.org/issue9075    created  pitrou                               
                                                                               

Add C-API documentation for PyUnicode_AsDecodedObject/Unicode  2010-06-24
       http://bugs.python.org/issue9076    created  haypo                                
       patch                                                                   

argparse does not handle arguments correctly after --          2010-06-24
CLOSED http://bugs.python.org/issue9077    created  iElectric                            
                                                                               

Fix C API documentation of unicode                             2010-06-24
       http://bugs.python.org/issue9078    created  haypo                                
       patch                                                                   

Make gettimeofday available in time module                     2010-06-25
       http://bugs.python.org/issue9079    created  belopolsky                           
       patch, needs review                                                     

Provide list prepend method (even though it's not efficient)   2010-06-25
CLOSED http://bugs.python.org/issue9080    created  andybuckley                          
                                                                               

Issues Now Closed (43)
______________________

MultiMethods with type annotations in 3000                     1035 days
       http://bugs.python.org/issue1004    benjamin.peterson                    
       patch                                                                   

subprocess.list2cmdline doesn't do pipe symbols                 975 days
       http://bugs.python.org/issue1300    chops at demiurgestudios.com            
       easy                                                                    

Popen.poll always returns None                                  816 days
       http://bugs.python.org/issue2475    tjreedy                              
                                                                               

Python interpreter uses Unicode surrogate pairs only before th  713 days
       http://bugs.python.org/issue3297    haypo                                
       patch                                                                   

py3k shouldn't use -fno-strict-aliasing anymore                 712 days
       http://bugs.python.org/issue3326    benjamin.peterson                    
       patch                                                                   

create a numbits() method for int and long types                699 days
       http://bugs.python.org/issue3439    mark.dickinson                       
       patch, needs review                                                     

os.path.realpath() get the wrong result                         554 days
       http://bugs.python.org/issue4654    r.david.murray                       
                                                                               

Compiling python 2.5.2 under Wine on linux.                     527 days
       http://bugs.python.org/issue4883    BreamoreBoy                          
                                                                               

3.0 sqlite doc: most examples refer to pysqlite2, use 2.x synt  516 days
       http://bugs.python.org/issue5005    tjreedy                              
                                                                               

Implement a way to change the python process name               448 days
       http://bugs.python.org/issue5672    piro                                 
       patch                                                                   

setup build with Platform SDK, finding vcvarsall.bat            407 days
       http://bugs.python.org/issue5969    georg.brandl                         
                                                                               

Failing test_signal.py on Redhat 4.1.2-44                       407 days
       http://bugs.python.org/issue5972    georg.brandl                         
                                                                               

datetime.strptime doesn't support %z format ?                     1 days
       http://bugs.python.org/issue6641    merwok                               
       patch                                                                   

webbrowser.get("firefox") does not work on Mac with installed   243 days
       http://bugs.python.org/issue7192    ronaldoussoren                       
       patch                                                                   

Backport 3.x nonlocal keyword to 2.7                            117 days
       http://bugs.python.org/issue8018    mark.dickinson                       
                                                                               

test_heapq interfering with test_import on py3k                  65 days
       http://bugs.python.org/issue8440    tim.golden                           
                                                                               

enumerate() test cases do not cover optional start argument      46 days
       http://bugs.python.org/issue8636    merwok                               
       patch                                                                   

_ssl.c uses PyWeakref_GetObject but doesn't incref result        45 days
       http://bugs.python.org/issue8682    pitrou                               
       patch                                                                   

Remove "w" format of PyParse_ParseTuple()                        27 days
       http://bugs.python.org/issue8850    haypo                                
       patch                                                                   

msvc9compiler.py: find_vcvarsall() doesn't work with VS2008 on   23 days
       http://bugs.python.org/issue8854    lemburg                              
       patch, 64bit                                                            

execfile does not work with UNC paths                            21 days
       http://bugs.python.org/issue8869    tim.golden                           
                                                                               

getargs.c: release the buffer on error                           18 days
       http://bugs.python.org/issue8926    haypo                                
       patch                                                                   

PyArg_Parse*(): "z" should not accept bytes                      16 days
       http://bugs.python.org/issue8949    haypo                                
       patch                                                                   

PyArg_Parse*(): factorize code of 's' and 'z' formats, and 'u'   16 days
       http://bugs.python.org/issue8951    haypo                                
       patch                                                                   

WINFUNCTYPE wrapped ctypes callbacks not functioning correctly   12 days
       http://bugs.python.org/issue8959    theller                              
                                                                               

Year range in timetuple                                           5 days
       http://bugs.python.org/issue9005    belopolsky                           
       patch                                                                   

os.path.normcase(None) does not raise an error on linux and sh    8 days
       http://bugs.python.org/issue9018    ezio.melotti                         
       patch, easy                                                             

2to3 doesn't handle byte comparison well                          0 days
       http://bugs.python.org/issue9043    merwok                               
                                                                               

UnboundLocalError in nested function                              1 days
       http://bugs.python.org/issue9049    mark.dickinson                       
                                                                               

UnboundLocalError in nested function                              0 days
       http://bugs.python.org/issue9050    merwok                               
                                                                               

2.7rc2 fails test_urllib_localnet tests on OS X                   0 days
       http://bugs.python.org/issue9052    belopolsky                           
                                                                               

PyUnicodeDecodeError_Create asserts that various arguments are    0 days
       http://bugs.python.org/issue9058    benjamin.peterson                    
       patch                                                                   

Backwards compatibility                                           0 days
       http://bugs.python.org/issue9059    ezio.melotti                         
                                                                               

urllib.urlopen crashes when launched from a thread                0 days
       http://bugs.python.org/issue9062    orsenthil                            
                                                                               

"from . import *"                                                 0 days
       http://bugs.python.org/issue9068    brett.cannon                         
                                                                               

Timestamps are rounded differently in py3k and trunk              0 days
       http://bugs.python.org/issue9070    belopolsky                           
                                                                               

TarFile doesn't support member files with a leading "./"          1 days
       http://bugs.python.org/issue9071    free.ekanayaka                       
                                                                               

Unloading modules - memleaks?                                     0 days
       http://bugs.python.org/issue9072    yappie                               
                                                                               

ssl module sets "debug" flag on SSL struct                        0 days
       http://bugs.python.org/issue9075    pitrou                               
                                                                               

argparse does not handle arguments correctly after --             1 days
       http://bugs.python.org/issue9077    iElectric                            
                                                                               

Provide list prepend method (even though it's not efficient)      0 days
       http://bugs.python.org/issue9080    andybuckley                          
                                                                               

webbrowser.open_new() opens in an existing browser window      2463 days
       http://bugs.python.org/issue812089  r.david.murray                       
                                                                               

mbcs encoding ignores errors                                   2394 days
       http://bugs.python.org/issue850997  haypo                                
       patch                                                                   


Top Issues Most Discussed (10)
______________________________

 19 Non-uniformity in randrange for large arguments.                   7 days
open        http://bugs.python.org/issue9025   

 19 2.7: eval hangs on AIX                                             8 days
open        http://bugs.python.org/issue9020   

 17 Python 2.7rc2 doesn't build on Mac OS X 10.4                       4 days
open        http://bugs.python.org/issue9046   

 14 test_float failure on Solaris                                      1 days
open        http://bugs.python.org/issue9069   

 13 msvc9compiler.py: find_vcvarsall() doesn't work with VS2008 on    23 days
closed      http://bugs.python.org/issue8854   

 10 no OS X buildbots in the stable list                               4 days
open        http://bugs.python.org/issue9048   

 10 os.path.normcase(None) does not raise an error on linux and sho    8 days
closed      http://bugs.python.org/issue9018   

  8 Provide list prepend method (even though it's not efficient)       0 days
closed      http://bugs.python.org/issue9080   

  8 Improve quality of Python/dtoa.c                                   9 days
open        http://bugs.python.org/issue9009   

  8 Add Mercurial support to patchcheck                               10 days
open        http://bugs.python.org/issue8999   


From barry at python.org  Fri Jun 25 18:18:47 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 25 Jun 2010 12:18:47 -0400
Subject: [Python-Dev] Schedule for Python 2.6.6
Message-ID: <20100625121847.60331d9e@heresy>

Benjamin is still planning to release Python 2.7 final on 2010-07-03, so it's
time for me to work out the release schedule for Python 2.6.6 - likely the
last maintenance release for Python 2.6.

Because summer schedules are crazy, and I want to leave two weeks between
2.6.6 rc1 and 2.6.6 final, my current schedule looks like:

* Python 2.6.6 rc 1 on Monday 2010-08-02
* Python 2.6.6 final on Monday 2010-08-16

This should give folks plenty of time to relax after 2.7 final, and still be
able to get those last minute fixes into the 2.6 tree.

Let me know if these dates don't work for you.
-Barry

From stephen at xemacs.org  Fri Jun 25 18:18:33 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 Jun 2010 01:18:33 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
Message-ID: <8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:

 > I do know the ultimate target codec -- that's the point.
 > 
 > IOW, I want to be able to do all my operations by passing
 > target-encoded strings to polymorphic functions.

IOW, you *do* have text and (ignoring efficiency issues) could just as
well use str.  But That Other Code is unreliable, so you need a marker
for your own internal strings indicating that they are validated,
while other strings are not.

This has nothing to do with bytes vs. str as string types, then; it's
all about validated (which your architecture indicates by using the
bytes type) vs. unvalidated (which your architecture indicates with
unicode).  Eg, in the case of your USPS vs. ecommerce example, you
can't even handle all bytes, so not all possible bytes objects are
valid.  And other applications might not be able to handle all
Japanese, but only a subset, so having valid EUC-JP wouldn't be
enough, you'd have to check repertoire -- might as well use str.

It seems to me what is wanted here is something like Perl's taint
mechanism, for *both* kinds of strings.  Am I missing something?
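
One way to picture such a marker, as a sketch only: it tracks "validated"
rather than "tainted" (the inverse of Perl's flag), and the class names
ValidStr and ValidBytes are hypothetical.  Only concatenation is handled;
other operations would silently return the plain type.

```python
class ValidStr(str):
    """str known to be validated; mixing with plain str loses the marker."""

    def __add__(self, other):
        joined = str(self) + other
        return ValidStr(joined) if isinstance(other, ValidStr) else joined


class ValidBytes(bytes):
    """bytes known to be validated; mixing with plain bytes loses the marker."""

    def __add__(self, other):
        joined = bytes(self) + other
        return ValidBytes(joined) if isinstance(other, ValidBytes) else joined


print(type(ValidStr("a") + ValidStr("b")).__name__)
print(type(ValidStr("a") + "b").__name__)
print(type(ValidBytes(b"a") + b"b").__name__)
```

The marker is independent of the bytes/str choice, which is the point being
made: validation status and string type are separate properties.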

But with your architecture, it seems to me that you actually don't
want polymorphic functions in the stdlib.  You want the stdlib
functions to be bytes-oriented if and only if they are reliable.  (This
is what I was saying to Guido elsewhere.)

BTW, this was a little unclear to me:

 > [Collisions will] be with other *unicode* strings.  Ones coming
 > from other code, and literals embedded in the stdlib.

What about the literals in the stdlib?  Are you saying they contain
invalid code points for your known output encoding?  Or are you saying
that with non-polymorphic unicode stdlib, you get lots of false
positives when combining with your validated bytes?


From barry at python.org  Fri Jun 25 18:28:29 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 25 Jun 2010 12:28:29 -0400
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <20100625121847.60331d9e@heresy>
References: <20100625121847.60331d9e@heresy>
Message-ID: <20100625122829.30b20e67@heresy>

On Jun 25, 2010, at 12:18 PM, Barry Warsaw wrote:

>* Python 2.6.6 rc 1 on Monday 2010-08-02
>* Python 2.6.6 final on Monday 2010-08-16

I've also updated the Google calendar of Python releases:

http://www.google.com/calendar/ical/b6v58qvojllt0i6ql654r1vh00%40group.calendar.google.com/public/basic.ics

-Barry

From stephen at xemacs.org  Fri Jun 25 18:30:08 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 Jun 2010 01:30:08 +0900
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTimbkqkdxm7_cPYKYVW4SV7gwzfEV-NKSkNv5N0g@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<876317o28m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimbkqkdxm7_cPYKYVW4SV7gwzfEV-NKSkNv5N0g@mail.gmail.com>
Message-ID: <871vbvnkhb.fsf@uwakimon.sk.tsukuba.ac.jp>

Ian Bicking writes:

 > I'm proposing these specials would be used in polymorphic functions, like
 > the functions in urllib.parse.  I would not personally use them in my own
 > code (unless of course I was writing my own polymorphic functions).
 > 
 > This also makes it less important that the objects be a full stand-in for
 > text, as their use should be isolated to specific functions, they aren't
 > objects that should be passed around much.  So you can easily identify and
 > quickly detect if you use unsupported operations on those text-like
 > objects.

OK.  That sounds reasonable to me, but I don't see any need for
a builtin type for it.  Inclusion in the stdlib is not quite a
no-brainer, but given Guido's endorsement of polymorphism, I can't
bring myself to go lower than +0.9 <wink>.

 > (This is all a very different use case from bytes+encoding, I think)

Very much so.

From stephen at xemacs.org  Fri Jun 25 18:37:58 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 Jun 2010 01:37:58 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTinZvEpFEyXtZN_dLkUYehpX9HrIiy4CEezzuH6Z@mail.gmail.com>
References: <AANLkTin6vAzrEO0HxoNpTmUs35sW2nXqIK0U1S7HVOCE@mail.gmail.com>
	<20100620184120.10EFB3A4099@sparrow.telecommunity.com>
	<20100620234723.600ad4a8@pitrou.net>
	<AANLkTimQRL4kj-iaklqwvhSEVBEZXTp_dRhmScYqAzgb@mail.gmail.com>
	<AANLkTinm0tJg-DQKy0eBZftWOsKadhq5XiclKYmPohq9@mail.gmail.com>
	<87wrtsd44p.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTik0Gz5SvNs-YaaBEjvszd643chKYs7B9obU3u6M@mail.gmail.com>
	<87631c4bca.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100621165611.GW5787@unaka.lan>
	<87r5jz3h8u.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100622055040.GE5787@unaka.lan>
	<87d3vj2tj2.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimsArm27KcchLWzj-AMMPlAlHJWsqGGNOoUsTPF@mail.gmail.com>
	<0D1D2134-2CF9-4F93-BE82-912C5297D36F@fuhm.net>
	<87zkymns55.fsf@uwakimon.sk.tsukuba.ac.jp>
	<hvt9af$t6n$1@dough.gmane.org>
	<AANLkTimjunQtAe9qlqpFnKHJqCNPkid0L1334CXdlZTr@mail.gmail.com>
	<87mxukonmq.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTikRXKoCgr9eGTxVq5KXuGpRtdtxRMhLovsXE-bP@mail.gmail.com>
	<878w63oam0.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTinZvEpFEyXtZN_dLkUYehpX9HrIiy4CEezzuH6Z@mail.gmail.com>
Message-ID: <87zkyjm5jt.fsf@uwakimon.sk.tsukuba.ac.jp>

Ian Bicking writes:

 > I don't get what you are arguing against.  Are you worried that if
 > we make URL code polymorphic that this will mean some code will
 > treat URLs as bytes, and that code will be incompatible with URLs
 > as text?  No one is arguing we remove text support from any of
 > these functions, only that we allow bytes.

No, I understand what Guido means by "polymorphic".

I'm arguing that as I understand one of Philip Eby's use cases,
"bytes" is a misspelling of "validated" and "unicode" is a misspelling
of "unvalidated".  In case of some kind of bug, polymorphic stdlib
functions would allow propagation of unvalidated/unicode within the
validated zone, aka "errors passing silently".

Now that I understand that that use case doesn't actually care about
bytes vs. unicode *string* semantics at all, the argument becomes
moot, I guess.


From ianb at colorstudy.com  Fri Jun 25 18:54:05 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 25 Jun 2010 11:54:05 -0500
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <871vbvnkhb.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com> 
	<876317o28m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<AANLkTimbkqkdxm7_cPYKYVW4SV7gwzfEV-NKSkNv5N0g@mail.gmail.com> 
	<871vbvnkhb.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <AANLkTikkeO0tK0Z5fHEU4dPNNWIyIHAdip40HNFqLfba@mail.gmail.com>

On Fri, Jun 25, 2010 at 11:30 AM, Stephen J. Turnbull <stephen at xemacs.org> wrote:

> Ian Bicking writes:
>
>  > I'm proposing these specials would be used in polymorphic functions,
> like
>  > the functions in urllib.parse.  I would not personally use them in my
> own
>  > code (unless of course I was writing my own polymorphic functions).
>  >
>  > This also makes it less important that the objects be a full stand-in
> for
>  > text, as their use should be isolated to specific functions, they aren't
>  > objects that should be passed around much.  So you can easily identify
> and
>  > quickly detect if you use unsupported operations on those text-like
>  > objects.
>
> OK.  That sounds reasonable to me, but I don't see any need for
> a builtin type for it.  Inclusion in the stdlib is not quite a
> no-brainer, but given Guido's endorsement of polymorphism, I can't
> bring myself to go lower than +0.9 <wink>.
>

Agreed on a builtin; I think it would be fine to put something in the
strings module, and then in these examples code that used '/' would instead
use strings.ascii('/') (not so sure of what the name should be though).
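
As a rough sketch of what such a special could look like (the class name
AsciiLiteral, its like() method, and join_path() are all hypothetical;
nothing like strings.ascii exists today), an ASCII-only literal can adapt
itself to whichever string type it is combined with:

```python
class AsciiLiteral:
    """An ASCII-only literal that adapts to the type it is combined with."""

    def __init__(self, text):
        text.encode("ascii")  # raises UnicodeEncodeError for non-ASCII input
        self._text = text

    def like(self, sample):
        """Return the literal as bytes or str, matching the type of sample."""
        if isinstance(sample, (bytes, bytearray)):
            return self._text.encode("ascii")
        return self._text

    def __add__(self, other):
        return self.like(other) + other

    def __radd__(self, other):
        return other + self.like(other)


SLASH = AsciiLiteral("/")


def join_path(base, name):
    """Polymorphic: works for two str arguments or two bytes arguments."""
    return base + SLASH + name
```

Mixing the two types still fails: join_path("a", b"b") raises TypeError,
because the literal has already become a str by the time the bytes
argument is added.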


-- 
Ian Bicking  |  http://blog.ianbicking.org

From tjreedy at udel.edu  Fri Jun 25 18:57:50 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Fri, 25 Jun 2010 12:57:50 -0400
Subject: [Python-Dev] docs - Copy
In-Reply-To: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
Message-ID: <i02n6d$1ao$2@dough.gmane.org>

On 6/24/2010 8:51 PM, Rich Healey wrote:
> http://docs.python.org/library/copy.html

Discussion of the wording of current docs should go to python-list. 
Py-dev is for development of future Python.


-- 
Terry Jan Reedy


From fuzzyman at voidspace.org.uk  Fri Jun 25 20:35:35 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 25 Jun 2010 19:35:35 +0100
Subject: [Python-Dev] Creating APIs that work as both decorators and context
	managers
Message-ID: <4C24F6F7.4040200@voidspace.org.uk>

Hello all,

I've put a recipe up on the Python cookbook for creating APIs that work 
as both decorators and context managers and wonder if it would be 
considered a useful addition to the functools module.

http://code.activestate.com/recipes/577273-decorator-and-context-manager-from-a-single-api/

I wrote this after writing almost identical code the second time for 
"patch" in the mock module. (The patch decorator can be used as a 
decorator or as a context manager and I was writing a new variant.) Both 
py.test and django have similar code in places, so it is not an uncommon 
pattern.

It is only 40 odd lines (ignore the ugly Python 2 & 3 compatibility 
hack), so I'm fine with it living on the cookbook - but it is at least 
slightly fiddly to write and has the added niceness of providing the 
optional exception handling semantics of __exit__ for decorators as well.

Example use (really hope email doesn't swallow the whitespace - my 
apologies in advance if it does):

from context import Context

class mycontext(Context):
     def __init__(self, *args):
         """Normal initialiser"""

     def start(self):
         """
         Called on entering the with block or starting the decorated
         function.

         If used in a with statement whatever this method returns will
         be the context manager.
         """

     def finish(self, *exc):
         """
         Called on exit. Arguments and return value of this method have
         the same meaning as the __exit__ method of a normal context
         manager.
         """

@mycontext('some', 'args')
def function():
     pass

with mycontext('some', 'args') as something:
     pass

I'm not entirely happy with the name of the class or the start and 
finish methods, so open to suggestions there. start and finish *could* 
be __enter__ and __exit__ - but that would make the class you implement 
*look* like a normal context manager and I thought it was better to 
distinguish them. Perhaps before and after?
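
For readers without the recipe to hand, a minimal sketch of how a base
class along these lines might work is below.  This is not the recipe's
code: it only assumes the start/finish protocol described above, and the
"logged" subclass, the "events" list and add_one() are made up for
illustration.

```python
import functools


class Context:
    """Base class: subclasses define start() and finish(*exc)."""

    def start(self):
        return self

    def finish(self, *exc):
        return False  # do not swallow exceptions by default

    # Context manager protocol, delegating to start/finish.
    def __enter__(self):
        return self.start()

    def __exit__(self, *exc):
        return self.finish(*exc)

    # Decorator protocol: wrap the function call in "with self".
    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            with self:
                return func(*args, **kwargs)
        return wrapper


events = []  # illustration only: records what happened, in order


class logged(Context):
    def __init__(self, name):
        self.name = name

    def start(self):
        events.append("start " + self.name)
        return self

    def finish(self, *exc):
        events.append("finish " + self.name)
        return False


@logged("add_one")
def add_one(x):
    return x + 1
```

One design consequence worth noting: if finish() returns a true value
and so swallows an exception, the decorated function returns None.
Python 3.2 and later ship contextlib.ContextDecorator, which covers the
same pattern with the normal __enter__/__exit__ names.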

All the best,

Michael Foord

-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.	



From brett at python.org  Fri Jun 25 20:58:42 2010
From: brett at python.org (Brett Cannon)
Date: Fri, 25 Jun 2010 11:58:42 -0700
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C246E81.3020302@scottdial.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com> 
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com> 
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
Message-ID: <AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>

On Fri, Jun 25, 2010 at 01:53, Scott Dial
<scott+python-dev at scottdial.com> wrote:
> On 6/24/2010 8:23 PM, James Y Knight wrote:
>> On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:
>>> If the package has .so files that aren't compatible with other versions
>>> of python, then what is the motivation for placing that in a shared
>>> location (since it can't actually be shared)
>>
>> Because python looks for .so files in the same place it looks for the
>> .py files of the same package.
>
> My suggestion was that a package that contains .so files should not be
> shared (e.g., the entire lxml package should be placed in a
> version-specific path). The motivation for this PEP was to simplify the
> installation of python packages for distros; it was not to reduce the
> number of .py files on the disk.

I assume you are talking about PEP 3147. You're right that the PEP was
for pyc files and that's it. No one is talking about rewriting the
PEP. The motivation Barry is using is an overarching one of distros
wanting to use a single directory install location for all installed
Python versions. That led to PEP 3147 and now this work.

>
> Placing .so files together does not simplify that install process in any
> way. You will still have to handle such packages in a special way. You
> must still compile the package multiple times for each relevant version
> of python (with special tagging that I imagine distutils can take care
> of) and, worse yet, you have created a trickier install than merely
> having multiple search paths (e.g., installing/uninstalling lxml for
> *one* version of python is actually more difficult in this scheme).

This is meant to be used by distros in a programmatic fashion, so my
response is "so what?" Their package management system is going to
maintain the directory, not a person. You and I are not going to be
using this for anything. This is purely meant for Linux OS vendors
(maybe OS X) to manage their installs through their package software.
I honestly do not expect human beings to be mucking around with these
installs (and I suspect Barry doesn't either).

>
> Either the motivation for this PEP is inaccurate or I am failing to
> understand how this is *simpler*. In the case of pure-python, this PEP
> is clearly a win, but I have not seen an argument that it is a win for
> .so files. Moreover, the PEP itself is titled "PYC Repository
> Directories" (not "shared site-packages") and makes no mention of .so
> files at all.

You're conflating what is being discussed with PEP 3147. That PEP is
independent of this. PEP 3147 just empowered this work to be relevant.

-Brett

>
> --
> Scott Dial
> scott at scottdial.com
> scodial at cs.indiana.edu

From scott+python-dev at scottdial.com  Fri Jun 25 21:42:38 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Fri, 25 Jun 2010 15:42:38 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
	<AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
Message-ID: <4C2506AE.3060002@scottdial.com>

On 6/25/2010 2:58 PM, Brett Cannon wrote:
> I assume you are talking about PEP 3147. You're right that the PEP was
> for pyc files and that's it. No one is talking about rewriting the
> PEP.

Yes, I am making reference to PEP 3147. I make reference to that PEP
because this change is of the same order of magnitude as the .pyc
change, and we asked for a PEP for that, and if this .so stuff is an
extension of that thought process, then it should either be reflected by
that PEP or a new PEP.

> The motivation Barry is using is an overarching one of distros
> wanting to use a single directory install location for all installed
> Python versions. That led to PEP 3147 and now this work.

It's unclear to me that that is the correct motivation, which you are
divining. As I understand it, the motivation is to *simplify
installation* for distros, which may or may not be achieved by using a
single directory. In the case of pure-python packages, a single
directory is an obvious win. In the case of mixed-python packages, I
remain to be persuaded there is any improvement achieved.

> This is meant to be used by distros in a programmatic fashion, so my
> response is "so what?" Their package management system is going to
> maintain the directory, not a person.

Then why is the status quo unacceptable? I have already explained how
this will still require programmatic steps of at least the same
difficulty as the status quo requires, so why should we change anything?

I am skeptical that this is a simple programmatic problem either: take
any random package on PyPI and tell me whether or not it has a .so file
that must be compiled. If such a .so file exists, then this package must
be special-cased and compiled for each version of Python on the system
(or will ever be on the system?). Such a package yields an arbitrary
number of .so files due to the number of versions of Python on the
machine, and I can't imagine how it is simpler to manage all of those
files than it is to manage multiple site-packages.

> You're conflating what is being discussed with PEP 3147. That PEP is
> independent of this. PEP 3147 just empowered this work to be relevant.

Without a PEP (be it PEP 3147 or some other), what is the justification
for doing this? The burden should be on "you" to explain why this is a
good idea and not just a clever idea.

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From dickinsm at gmail.com  Fri Jun 25 22:02:36 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Fri, 25 Jun 2010 21:02:36 +0100
Subject: [Python-Dev] Creating APIs that work as both decorators and
	context managers
In-Reply-To: <4C24F6F7.4040200@voidspace.org.uk>
References: <4C24F6F7.4040200@voidspace.org.uk>
Message-ID: <AANLkTinh76WLB3uWjsry62G_fj6TH7DIKvLx14RUK7bH@mail.gmail.com>

On Fri, Jun 25, 2010 at 7:35 PM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
> Hello all,
>
> I've put a recipe up on the Python cookbook for creating APIs that work as
> both decorators and context managers and wonder if it would be considered a
> useful addition to the functools module.
> http://code.activestate.com/recipes/577273-decorator-and-context-manager-from-a-single-api/

It's an interesting idea.  I wanted almost exactly this a little while
ago, while doing some experiments to add an IEEE 754-compliance
wrapper to the decimal module (for my own use) [1].  It seems quite
natural that one might want to wrap both functions and blocks in the
same way.

[1] In case anyone wants the details, this was for a
'delay-exceptions' operation, that allows you to execute some number
of arithmetic operations, keeping track of the floating-point signals
that they produce but not raising the corresponding exceptions until
the end of the block;  obviously this idea applies equally well to
functions as to blocks.  It's one of the recommended exception
handling modes from section 8 of IEEE 754-2008.
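
A minimal sketch of the delay-exceptions idea using today's decimal
module follows.  The name delay_exceptions is hypothetical and this is
not the wrapper described above; it only assumes that decimal's traps
and flags are a reasonable stand-in for the signals in question.

```python
import decimal
from contextlib import contextmanager


@contextmanager
def delay_exceptions():
    """Run a block with traps disabled, then raise for any signal that
    occurred and would normally have been trapped."""
    with decimal.localcontext() as ctx:
        # Remember which signals were trapped, then switch the traps off.
        trapped = [sig for sig in list(ctx.traps) if ctx.traps[sig]]
        for sig in trapped:
            ctx.traps[sig] = False
        ctx.clear_flags()
        yield ctx
        # Flags are still recorded even though the traps were disabled.
        raised = [sig for sig in trapped if ctx.flags[sig]]
    if raised:
        names = ", ".join(sig.__name__ for sig in raised)
        raise raised[0]("delayed signals: " + names)
```

Inside the block, Decimal(1) / Decimal(0) quietly returns Infinity; the
DivisionByZero exception is raised only when the block ends.  Since
Python 3.2, objects produced by contextlib.contextmanager can also be
applied as decorators, so the same helper wraps functions too.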

Mark

From foom at fuhm.net  Fri Jun 25 22:12:34 2010
From: foom at fuhm.net (James Y Knight)
Date: Fri, 25 Jun 2010 16:12:34 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C246E81.3020302@scottdial.com>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
Message-ID: <A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>


On Jun 25, 2010, at 4:53 AM, Scott Dial wrote:

> On 6/24/2010 8:23 PM, James Y Knight wrote:
>> On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:
>>> If the package has .so files that aren't compatible with other
>>> versions of python, then what is the motivation for placing that in a
>>> shared location (since it can't actually be shared)
>>
>> Because python looks for .so files in the same place it looks for the
>> .py files of the same package.
>
> My suggestion was that a package that contains .so files should not be
> shared (e.g., the entire lxml package should be placed in a
> version-specific path). The motivation for this PEP was to simplify
> the installation of python packages for distros; it was not to reduce
> the number of .py files on the disk.
>
> Placing .so files together does not simplify that install process in
> any way. You will still have to handle such packages in a special way.


This is a good point, but I think it still falls short of a solution. For
a package like lxml, indeed you are correct. Since Debian needs to
build it once per version, it could just put the entire package (.py
files and .so files) into a different per-python-version directory.

However, then you have to also consider python packages made up of  
multiple distro packages -- like twisted or zope. Twisted includes  
some C extensions in the core package. But then there are other  
twisted modules (installed under a "twisted.foo" name) which do not  
include C extensions. If the base twisted package is installed under a  
version-specific directory, then all of the submodule packages need to  
also be installed under the same version-specific directory (and thus  
built for all versions).

In the past, it has proven somewhat tricky to coordinate which  
directory the modules for package "foo" should be installed in,  
because you need to know whether *any* of the related packages  
includes a native ".so" file, not just the current package.

The converse situation, where a base package did *not* get installed  
into a version-specific directory because it includes no native code,  
but a submodule *does* include a ".so" file, is even trickier.
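
To illustrate why the decision cannot be made one distro package at a
time, here is a small sketch.  The function names and the list of
suffixes are assumptions made for illustration, not anything a real
packaging tool provides.

```python
from pathlib import Path

# Assumption: these suffixes are enough to spot native extension modules.
NATIVE_SUFFIXES = (".so", ".pyd")


def has_native_code(root):
    """True if any file below root looks like a native extension."""
    return any(
        path.suffix in NATIVE_SUFFIXES
        for path in Path(root).rglob("*")
        if path.is_file()
    )


def needs_version_specific_dir(related_package_roots):
    """All distro packages sharing one Python package namespace must be
    installed together, so one native file anywhere decides for all."""
    return any(has_native_code(root) for root in related_package_roots)
```

The awkward part is the argument: the packager of a pure-Python
submodule has to know about every sibling distro package in order to
pass the right list.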

James

From martin at v.loewis.de  Fri Jun 25 22:27:31 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 22:27:31 +0200
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <20100625003149.GA16084@thorne.id.au>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
Message-ID: <4C251133.2090505@v.loewis.de>

>>> I am extremely keen for this to happen. Does anyone have ownership of this
>>> project? There was some discussion of it up-list but the discussion fizzled.
>>
>> Can you please explain what "this project" is, in the context of your
>> message? GSoC? GHOP?
> 
> Oh, I thought this was quite clear. I was specifically meaning the large
> "Python 2 or 3" button on python.org. It would help users who want to know
> what version of python to use if they had a clear guide as to what version
> to download.

Ah, ok. No, nobody has taken ownership of that project, and likely,
nobody actually will - unless you volunteer.

Regards,
Martin

From martin at v.loewis.de  Fri Jun 25 22:30:34 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 22:30:34 +0200
Subject: [Python-Dev] docs - Copy
In-Reply-To: <i02n6d$1ao$2@dough.gmane.org>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>
	<i02n6d$1ao$2@dough.gmane.org>
Message-ID: <4C2511EA.3000200@v.loewis.de>

Am 25.06.2010 18:57, schrieb Terry Reedy:
> On 6/24/2010 8:51 PM, Rich Healey wrote:
>> http://docs.python.org/library/copy.html
> 
> Discussion of the wording of current docs should go to python-list.
> Py-dev is for development of future Python.

No no no. Mis-worded documentation is a bug, just like any other bug,
and deserves being discussed here.

Furthermore, a sufficient condition for mis-wording is if a user read it
in full, and still managed to misunderstand (as happened here).

Regards,
Martin

From martin at v.loewis.de  Fri Jun 25 22:31:28 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 22:31:28 +0200
Subject: [Python-Dev] docs - Copy
In-Reply-To: <AANLkTikBCSm12hMFoPoZ7e6X1FShtZDQtwpOSW9eCD7-@mail.gmail.com>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>	<i00vb1$fsp$1@dough.gmane.org>
	<AANLkTikBCSm12hMFoPoZ7e6X1FShtZDQtwpOSW9eCD7-@mail.gmail.com>
Message-ID: <4C251220.3050106@v.loewis.de>

> My apologies guys, I see now.
> 
> I will see if I can think of a less ambiguous way to word this and submit a bug.

Please don't take out or rephrase the word "shallow", though. This has a
long CS tradition of meaning exactly what is meant here.

Regards,
Martin

From martin at v.loewis.de  Fri Jun 25 22:33:38 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 22:33:38 +0200
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <20100625121847.60331d9e@heresy>
References: <20100625121847.60331d9e@heresy>
Message-ID: <4C2512A2.1040404@v.loewis.de>

Am 25.06.2010 18:18, schrieb Barry Warsaw:
> Benjamin is still planning to release Python 2.7 final on 2010-07-03, so it's
> time for me to work out the release schedule for Python 2.6.6 - likely the
> last maintenance release for Python 2.6.
> 
> Because summer schedules are crazy, and I want to leave two weeks between
> 2.6.6 rc1 and 2.6.6 final, my current schedule looks like:
> 
> * Python 2.6.6 rc 1 on Monday 2010-08-02
> * Python 2.6.6 final on Monday 2010-08-16

That would barely work for me. If schedule slips in any way, we'll have
to move the release into end-of-September (but the days as proposed are
fine).

Regards,
Martin

From glyph at twistedmatrix.com  Fri Jun 25 22:43:55 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Fri, 25 Jun 2010 16:43:55 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
Message-ID: <96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>


On Jun 24, 2010, at 4:59 PM, Guido van Rossum wrote:

> Regarding the proposal of a String ABC, I hope this isn't going to
> become a backdoor to reintroduce the Python 2 madness of allowing
> equivalency between text and bytes for *some* strings of bytes and not
> others.

For my part, what I want out of a string ABC is simply the ability to do application-specific optimizations.

There are many applications where all input and output is text, but _must_ be UTF-8.  Even GTK uses UTF-8 as its native text representation, so "output" could just be display.

Right now, in Python 3, the only way to be "correct" about this is to copy every byte of input into 4 bytes of output, then copy each code point *back* into a single byte of output.  If all your application does is rewrite the occasional XML attribute, for example, this cost can be significant, if not overwhelming.

I'd like a version of 'decode' which would give me a type that was, in every respect, unicode, and responded to all protocols exactly as other unicode objects (or "str objects", if you prefer py3 nomenclature ;-)) do, but wouldn't actually copy any of that memory unless it really needed to (for example, to pass to a C API that expected native wide characters), and that would hold on to the original bytes so that it could produce them on demand if encoded to the same encoding again. So, as others in this thread have mentioned, the 'ABC' really implies some stuff about C APIs as well.

I'm not sure about the exact performance impact of such a class, which is why I'd like the ability to implement it *outside* of the stdlib and see how it works on a project, and return with a proposal along with some data.  There are also different ways to implement this, and other optimizations (like ropes) which might be better.

You can almost do this today, but the lack of things like the hypothetical "__rcontains__" does make it impossible to be totally transparent about it.
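
A minimal sketch of the kind of class I have in mind is below.  The name
LazyText is made up, and this only shows the shape of the idea, not a
full str stand-in.  Note that it does not validate the bytes up front,
so invalid input is only detected when something forces a decode.

```python
import codecs


class LazyText:
    """Text-like wrapper that keeps the original bytes and decodes on demand."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = bytes(raw)
        self._encoding = codecs.lookup(encoding).name  # normalised name
        self._text = None

    def _decoded(self):
        if self._text is None:
            self._text = self._raw.decode(self._encoding)
        return self._text

    def encode(self, encoding="utf-8"):
        # Same encoding as the input: hand back the original bytes untouched.
        if codecs.lookup(encoding).name == self._encoding:
            return self._raw
        return self._decoded().encode(encoding)

    def __str__(self):
        return self._decoded()

    def __len__(self):
        return len(self._decoded())

    def __contains__(self, item):
        return str(item) in self._decoded()


text = LazyText(b"caf\xc3\xa9")
# "af" in text      works, via LazyText.__contains__
# text in "a cafe"  raises TypeError: str.__contains__ only accepts str,
#                   and there is no reflected hook for the wrapper to use
```

The last comment is the missing "__rcontains__" problem: the wrapper can
answer questions asked of it, but cannot take part when a real str is on
the right-hand side of "in".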

From barry at python.org  Fri Jun 25 22:59:44 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 25 Jun 2010 16:59:44 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23D006.6080800@netwok.org>
References: <20100624115048.4fd152e3@heresy> <4C23A901.7060100@netwok.org>
	<20100624172302.024687ef@heresy> <4C23D006.6080800@netwok.org>
Message-ID: <20100625165944.2cac0053@heresy>

On Jun 24, 2010, at 11:37 PM, Éric Araujo wrote:

>Your plan seems good. Adding keyword arguments should not create
>compatibility issues, and I suspect the impact on the code of build_ext
>may be actually quite small. I'll try to review your patch even though I
>don't know C or compiler oddities, but Tarek will have the best insight
>and the final word.

The C and configure/Makefile bits are pretty trivial.  It basically extends
the list of shared library extensions searched for on *nix machines, and
allows that to be set on the ./configure command.

As for the impact on distutils, with updated tests, it's less than 100 lines
of diff.  Again there it essentially allows us to pass the extension that
build_ext writes to from the setup.py, via the Extension class.

Because distutils' default is to use the $SO variable from the
system-installed Makefile, with the change to dynload_shlib.c, configure.in,
and Makefile.pre.in, we would get distutils writing the versioned .so files
for free.  I'll note further that if you *don't* specify this to ./configure,
nothing much changes[1].
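
In Python terms, the lookup described above amounts to something like
the following sketch.  The real change lives in C (dynload_shlib.c); the
function names here and the "cpython-32" tag used as an example are
hypothetical.

```python
import os


def candidate_filenames(module_name, version_tag=None):
    """Filenames to try, in order, for an extension module."""
    names = []
    if version_tag:
        # Versioned name first, e.g. foo.<tag>.so
        names.append("%s.%s.so" % (module_name, version_tag))
    # The plain name is always tried last.
    names.append(module_name + ".so")
    return names


def find_extension(directory, module_name, version_tag=None):
    """Return the first candidate that exists in directory, else None."""
    for filename in candidate_filenames(module_name, version_tag):
        path = os.path.join(directory, filename)
        if os.path.exists(path):
            return path
    return None
```

With no tag configured the candidate list is just ["foo.so"], which is
the sense in which nothing much changes by default.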

The distutils part of the patch is only there to disable or override the
default, and *that's* only there to support proposed semantics that foo.so be
used for PEP 384-compliant ABI extension modules.

IOW, until PEP 384 is actually implemented, the distutils part of the patch is
unnecessary.  However, if the other changes are accepted, then I will add a
discussion of this issue to PEP 384, and we can figure out the best semantics
and implementation at that point.  I honestly don't know if I am going to get
to work on PEP 384 before 3.2 beta.

>In case the time machine's not available, your suggestion about getting
>the filename from the Extension instance instead of passing in a string
>can most certainly land in distutils2.

Cool.

-Barry

[1] Well, I now realize you'll get an extra useless stat call, but I will fix
that.

From guido at python.org  Fri Jun 25 23:02:05 2010
From: guido at python.org (Guido van Rossum)
Date: Fri, 25 Jun 2010 14:02:05 -0700
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com> 
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com> 
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
Message-ID: <AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>

On Fri, Jun 25, 2010 at 1:43 PM, Glyph Lefkowitz
<glyph at twistedmatrix.com> wrote:
>
> On Jun 24, 2010, at 4:59 PM, Guido van Rossum wrote:
>
> > Regarding the proposal of a String ABC, I hope this isn't going to
> > become a backdoor to reintroduce the Python 2 madness of allowing
> > equivalency between text and bytes for *some* strings of bytes and not
> > others.
>
> For my part, what I want out of a string ABC is simply the ability to do
> application-specific optimizations.
>
> There are many applications where all input and output is text, but _must_
> be UTF-8.  Even GTK uses UTF-8 as its native text representation, so
> "output" could just be display.
>
> Right now, in Python 3, the only way to be "correct" about this is to copy
> every byte of input into 4 bytes of output, then copy each code point *back*
> into a single byte of output.  If all your application does is rewrite the
> occasional XML attribute, for example, this cost can be significant, if not
> overwhelming.
>
> I'd like a version of 'decode' which would give me a type that was, in every
> respect, unicode, and responded to all protocols exactly as other
> unicode objects (or "str objects", if you prefer py3 nomenclature ;-)) do,
> but wouldn't actually copy any of that memory unless it really needed to
> (for example, to pass to a C API that expected native wide characters), and
> that would hold on to the original bytes so that it could produce them on
> demand if encoded to the same encoding again.  So, as others in this thread
> have mentioned, the 'ABC' really implies some stuff about C APIs as well.
>
> I'm not sure about the exact performance impact of such a class, which is
> why I'd like the ability to implement it *outside* of the stdlib and see how
> it works on a project, and return with a proposal along with some data.
> There are also different ways to implement this, and other optimizations
> (like ropes) which might be better.
>
> You can almost do this today, but the lack of things like the hypothetical
> "__rcontains__" does make it impossible to be totally transparent about it.

But you'd still have to validate it, right? You wouldn't want to go on
using what you thought was wrapped UTF-8 if it wasn't actually valid
UTF-8 (or you'd be worse off than in Python 2). So you're really just
worried about space consumption. I'd like to see a lot of hard memory
profiling data before I got overly worried about that.

-- 
--Guido van Rossum (python.org/~guido)

From barry at python.org  Fri Jun 25 23:03:22 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 25 Jun 2010 17:03:22 -0400
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <4C2512A2.1040404@v.loewis.de>
References: <20100625121847.60331d9e@heresy>
	<4C2512A2.1040404@v.loewis.de>
Message-ID: <20100625170322.5ece724f@heresy>

On Jun 25, 2010, at 10:33 PM, Martin v. Löwis wrote:

>Am 25.06.2010 18:18, schrieb Barry Warsaw:
>> Benjamin is still planning to release Python 2.7 final on 2010-07-03, so it's
>> time for me to work out the release schedule for Python 2.6.6 - likely the
>> last maintenance release for Python 2.6.
>> 
>> Because summer schedules are crazy, and I want to leave two weeks between
>> 2.6.6 rc1 and 2.6.6 final, my current schedule looks like:
>> 
>> * Python 2.6.6 rc 1 on Monday 2010-08-02
>> * Python 2.6.6 final on Monday 2010-08-16
>
>That would barely work for me. If schedule slips in any way, we'll have
>to move the release into end-of-September (but the days as proposed are
>fine).

Would that be bad or good (slipping into September)?  I'd like to get a
release out as soon after 2.7 final as possible, but it's an entirely
self-imposed deadline.  There's no reason why we can't push the whole 2.6.6
thing later if that works better for you.  OTOH, I can't go much earlier so if
September is bad for you, then we'll stick to the above dates.

-Barry

From fuzzyman at voidspace.org.uk  Fri Jun 25 23:06:00 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 25 Jun 2010 22:06:00 +0100
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <4C251133.2090505@v.loewis.de>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>	<20100625003149.GA16084@thorne.id.au>
	<4C251133.2090505@v.loewis.de>
Message-ID: <4C251A38.3090205@voidspace.org.uk>

On 25/06/2010 21:27, "Martin v. Löwis" wrote:
>>>> I am extremely keen for this to happen. Does anyone have ownership of this
>>>> project? There was some discussion of it up-list but the discussion fizzled.
>>>>          
>>> Can you please explain what "this project" is, in the context of your
>>> message? GSoC? GHOP?
>>>        
>> Oh, I thought this was quite clear. I was specifically meaning the large
>> "Python 2 or 3" button on python.org. It would help users who want to know
>> what version of python to use if they had a clear guide as to what version
>> to download.
>>      
> Ah, ok. No, nobody has taken ownership of that project, and likely,
> nobody actually will - unless you volunteer.
>    

What page were we suggesting linking to? IIRC someone made a good start 
in the wiki.

I'll move the discussion to pydotorg-www (still need the question about 
answering) and see if we can get it done.

All the best,

Michael

> Regards,
> Martin


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From martin at v.loewis.de  Fri Jun 25 23:14:53 2010
From: martin at v.loewis.de (=?windows-1252?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 23:14:53 +0200
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <4C251A38.3090205@voidspace.org.uk>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>	<20100625003149.GA16084@thorne.id.au>
	<4C251133.2090505@v.loewis.de> <4C251A38.3090205@voidspace.org.uk>
Message-ID: <4C251C4D.50806@v.loewis.de>

> What page were we suggesting linking to?

I don't think anybody proposed anything specific. Steve Holden
suggested it should go to "reasoned discussion of the
pros and cons as evinced in this thread". Stephen Thorne didn't
propose anything specific but to have a large button.

> I'll move the discussion to pydotorg-www 

I'll predict that this is its death :-(

Regards,
Martin


From martin at v.loewis.de  Fri Jun 25 23:16:23 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Fri, 25 Jun 2010 23:16:23 +0200
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <20100625170322.5ece724f@heresy>
References: <20100625121847.60331d9e@heresy>	<4C2512A2.1040404@v.loewis.de>
	<20100625170322.5ece724f@heresy>
Message-ID: <4C251CA7.3070902@v.loewis.de>

> Would that be bad or good (slipping into September)?  I'd like to get a
> release out as soon after 2.7 final as possible, but it's an entirely
> self-imposed deadline.  There's no reason why we can't push the whole 2.6.6
> thing later if that works better for you.  OTOH, I can't go much earlier so if
> September is bad for you, then we'll stick to the above dates.

I think we can strive for your original proposal. If it slips, we let it
slip by a month or two.

Regards,
Martin

From fuzzyman at voidspace.org.uk  Fri Jun 25 23:31:45 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 25 Jun 2010 22:31:45 +0100
Subject: [Python-Dev] Creating APIs that work as both decorators and
 context managers
In-Reply-To: <4C24F6F7.4040200@voidspace.org.uk>
References: <4C24F6F7.4040200@voidspace.org.uk>
Message-ID: <4C252041.7000808@voidspace.org.uk>

On 25/06/2010 19:35, Michael Foord wrote:
> Hello all,
>
> I've put a recipe up on the Python cookbook for creating APIs that 
> work as both decorators and context managers and wonder if it would be 
> considered a useful addition to the functools module.
>
> http://code.activestate.com/recipes/577273-decorator-and-context-manager-from-a-single-api/

Actually contextlib would be a much more sensible home for it.
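
For readers who cannot reach the recipe, here is a minimal sketch of how
such a base class might work. It is not the recipe's actual code: the
Context class body, the "logged" subclass and the "events" list are all
assumptions made up for illustration, and only the start/finish method
names come from the example quoted below.

```python
import functools


class Context(object):
    """Hypothetical base class: subclasses override start() and finish()."""

    def start(self):
        """Called on entering the with block or the decorated function."""

    def finish(self, *exc):
        """Called on exit; a true return value suppresses the exception."""

    def __enter__(self):
        return self.start()

    def __exit__(self, *exc):
        return self.finish(*exc)

    def __call__(self, func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Reuse the with-statement machinery so a decorated call gets
            # the same exception handling as a with block.
            with self:
                return func(*args, **kwargs)
        return wrapper


class logged(Context):
    """Illustrative subclass that records entry and exit in a list."""

    def __init__(self, name, log):
        self.name = name
        self.log = log

    def start(self):
        self.log.append("enter " + self.name)
        return self

    def finish(self, *exc):
        self.log.append("exit " + self.name)
        return False  # do not swallow exceptions


events = []


@logged("decorated", events)
def function():
    events.append("body")
    return 42


result = function()

with logged("block", events) as something:
    events.append("inside")
```

Routing the decorator through "with self" keeps the two uses from
drifting apart: whatever finish() does with an exception applies equally
to decorated functions and to with blocks.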

Michael

>
> I wrote this after writing almost identical code the second time for 
> "patch" in the mock module. (The patch decorator can be used as a 
> decorator or as a context manager and I was writing a new variant.) 
> Both py.test and django have similar code in places, so it is not an 
> uncommon pattern.
>
> It is only 40 odd lines (ignore the ugly Python 2 & 3 compatibility 
> hack), so I'm fine with it living on the cookbook - but it is at least 
> slightly fiddly to write and has the added niceness of providing the 
> optional exception handling semantics of __exit__ for decorators as well.
>
> Example use (really hope email doesn't swallow the whitespace - my 
> apologies in advance if it does):
>
> from context import Context
>
> class mycontext(Context):
>     def __init__(self, *args):
>         """Normal initialiser"""
>
>     def start(self):
>         """
>         Called on entering the with block or starting the decorated 
> function.
>
>         If used in a with statement whatever this method returns will 
> be the
>         context manager.
>         """
>
>     def finish(self, *exc):
>         """
>         Called on exit. Arguments and return value of this method have
>         the same meaning as the __exit__ method of a normal context
>         manager.
>         """
>
> @mycontext('some', 'args')
> def function():
>     pass
>
> with mycontext('some', 'args') as something:
>     pass
>
> I'm not entirely happy with the name of the class or the start and 
> finish methods, so open to suggestions there. start and finish *could* 
> be __enter__ and __exit__ - but that would make the class you 
> implement *look* like a normal context manager and I thought it was 
> better to distinguish them. Perhaps before and after?
>
> All the best,
>
> Michael Foord
> -- 
> http://www.ironpythoninaction.com/
> http://www.voidspace.org.uk/blog
>


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog


From glyph at twistedmatrix.com  Fri Jun 25 23:40:34 2010
From: glyph at twistedmatrix.com (Glyph Lefkowitz)
Date: Fri, 25 Jun 2010 17:40:34 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
Message-ID: <51EFE211-DBCA-497E-9BC5-CC0D2256173E@twistedmatrix.com>


On Jun 25, 2010, at 5:02 PM, Guido van Rossum wrote:

> But you'd still have to validate it, right? You wouldn't want to go on
> using what you thought was wrapped UTF-8 if it wasn't actually valid
> UTF-8 (or you'd be worse off than in Python 2). So you're really just
> worried about space consumption.

So, yes, I am mainly worried about memory consumption, but don't underestimate the pure CPU cost of doing all the copying.  It's quite a bit faster to simply scan through a string than to scan it while also writing a copy to some other area of memory, which keeps evicting data from the L2 cache.

Plus, if I am decoding with the surrogateescape error handler (or its effective equivalent), then no, I don't need to validate it in advance; interpretation can be done lazily as necessary.  I realize that this is just GIGO, but I wouldn't be doing this on data that didn't have an explicitly declared or required encoding in the first place.
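
A rough sketch of the lazy approach, to make the idea concrete. The
LazyText class and its attribute names are hypothetical, invented for
this illustration; only bytes.decode, the "surrogateescape" error
handler and codecs.lookup are real library features. It is nowhere near
a full string type, since it supports none of the str protocols.

```python
import codecs


class LazyText(object):
    """Hypothetical wrapper: keeps the original bytes, decodes on demand."""

    def __init__(self, raw, encoding="utf-8"):
        self._raw = raw
        # Normalise the codec name so "UTF-8" and "utf-8" compare equal.
        self._encoding = codecs.lookup(encoding).name
        self._text = None  # filled in on first use

    def __str__(self):
        if self._text is None:
            # surrogateescape does not raise on bad input: each undecodable
            # byte becomes a lone surrogate (U+DC80..U+DCFF) that encodes
            # back to the same byte.
            self._text = self._raw.decode(self._encoding, "surrogateescape")
        return self._text

    def encode(self, encoding="utf-8"):
        if codecs.lookup(encoding).name == self._encoding:
            # Same encoding as the input: hand back the original bytes
            # without decoding or copying anything.
            return self._raw
        return str(self).encode(encoding)


raw = b"caf\xc3\xa9 \xff"        # valid UTF-8 followed by one invalid byte
lazy = LazyText(raw)
same = lazy.encode("UTF-8")      # no decode happens here
text = str(lazy)                 # decoding is deferred until this point
```

The invalid byte is only noticed, as a lone surrogate, by code that
actually looks at the decoded text, which is the GIGO trade-off
described above.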

> I'd like to see a lot of hard memory profiling data before I got overly worried about that.


I know of several Python applications that are already constrained by memory.  I don't have a lot of hard memory profiling data, but in an environment where you're spawning as many processes as you can in order to consume _all_ the physically available RAM for string processing, it stands to reason that properly decoding everything and thereby exploding everything out into 4x as much data (or 2x, if you're lucky) would result in a commensurate decrease in throughput.
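
The 4x and 2x figures can be checked with a little arithmetic. This is a
sketch only: the sample string is made up, and the encoded lengths stand
in for the per-code-point cost of 4-byte and 2-byte internal storage
rather than measuring any interpreter's actual memory layout.

```python
text = "<item id='42'>na\u00efve caf\u00e9</item>"

utf8 = text.encode("utf-8")        # 1 byte per ASCII character
utf16 = text.encode("utf-16-le")   # 2 bytes per BMP code point, no BOM
utf32 = text.encode("utf-32-le")   # 4 bytes per code point, no BOM

ratio_wide = len(utf32) / float(len(utf8))
ratio_narrow = len(utf16) / float(len(utf8))
```

For mostly-ASCII markup like this, ratio_wide comes out a little under
4 and ratio_narrow a little under 2.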

I don't think I could even reasonably _propose_ that such a project stop treating textual data as bytes, because there's no optimization strategy once that sort of architecture has been put into place. If your function says "this takes unicode", then you just have to bite the bullet and decode it, or rewrite it again to have a different requirement.

So, right now, I don't know where I'd get the data with which to make the argument in the first place :).  If there were some abstraction in the core's treatment of strings, though, I could decode things and note their encoding without immediately paying this cost (or alternately, pay the cost to see if it's so bad, but with the option of managing it or optimizing it separately).  This is why I'm asking for a way for me to implement my own string type, and not for a change of behavior or an optimization in the stdlib itself: I could be wrong, I don't have a particularly high level of certainty in my performance estimates, but I think that my concerns are realistic enough that I don't want to embark on a big re-architecture of text-handling only to have it become a performance nightmare that needs to be reverted.

As Robert Collins pointed out, they already have performance issues related to encoding in Bazaar.  I know they've done a lot of profiling in that area, so I hope eventually someone from that project will show up with some data to demonstrate it :).  And I've definitely heard many, many anecdotes (some of them in this thread) about people distorting their data structures in various ways to avoid paying decoding cost in the ASCII/latin1 case, whether it's *actually* a significant performance issue or not.  I would very much like to tell those people "Just call .decode(), and if it turns out to actually be a performance issue, you can always deal with it later, with a custom string type."  I'm confident that in *most* cases, it would not be.

Anyway, this may be a serious issue, but I increasingly feel like I'm veering into python-ideas territory, so perhaps I'll just have to burn this bridge when I come to it.  Hopefully after the moratorium.


From barry at python.org  Fri Jun 25 23:53:06 2010
From: barry at python.org (Barry Warsaw)
Date: Fri, 25 Jun 2010 17:53:06 -0400
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <4C251CA7.3070902@v.loewis.de>
References: <20100625121847.60331d9e@heresy> <4C2512A2.1040404@v.loewis.de>
	<20100625170322.5ece724f@heresy> <4C251CA7.3070902@v.loewis.de>
Message-ID: <20100625175306.6fa9e1eb@heresy>

On Jun 25, 2010, at 11:16 PM, Martin v. Löwis wrote:

>> Would that be bad or good (slipping into September)?  I'd like to get a
>> release out as soon after 2.7 final as possible, but it's an entirely
>> self-imposed deadline.  There's no reason why we can't push the whole 2.6.6
>> thing later if that works better for you.  OTOH, I can't go much earlier so if
>> September is bad for you, then we'll stick to the above dates.
>
>I think we can strive for your original proposal. If it slips, we let it
>slip by a month or two.

Cool, thanks Martin.

-Barry

From fuzzyman at voidspace.org.uk  Fri Jun 25 23:53:29 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Fri, 25 Jun 2010 22:53:29 +0100
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <4C251C4D.50806@v.loewis.de>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>	<20100625003149.GA16084@thorne.id.au>
	<4C251133.2090505@v.loewis.de> <4C251A38.3090205@voidspace.org.uk>
	<4C251C4D.50806@v.loewis.de>
Message-ID: <4C252559.5060800@voidspace.org.uk>

On 25/06/2010 22:14, "Martin v. Löwis" wrote:
>> What page were we suggesting linking to?
>>      
> I don't think anybody proposed anything specific. Steve Holden
> suggested it should go to "reasoned discussion of the
> pros and cons as evinced in this thread". Stephen Thorne didn't
> propose anything specific but to have a large button.
>
>    

Earlier in this discussion *someone* did start a page on the wiki, with 
this use case in mind... You forced me to actually look it up:

     http://wiki.python.org/moin/Python2orPython3

>> I'll move the discussion to pydotorg-www
>>      
> I'll predict that this is its death :-(
>    

Heh.

Michael

> Regards,
> Martin
>    


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog



From tseaver at palladion.com  Sat Jun 26 00:12:10 2010
From: tseaver at palladion.com (Tres Seaver)
Date: Fri, 25 Jun 2010 18:12:10 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
Message-ID: <i039jr$h8$1@dough.gmane.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Guido van Rossum wrote:

> But you'd still have to validate it, right? You wouldn't want to go on
> using what you thought was wrapped UTF-8 if it wasn't actually valid
> UTF-8 (or you'd be worse off than in Python 2). So you're really just
> worried about space consumption. I'd like to see a lot of hard memory
> profiling data before I got overly worried about that.

I do know for a fact that using a UCS2-compiled Python instead of the
system's UCS4-compiled Python leads to a measurable, noticeable drop in
memory consumption of long-running webserver processes using Unicode
(Zope, repoze.bfg, etc).  We routinely build Python from source for
deployments precisely because of this fact (in part -- the absurd
choices made by packagers to exclude crucial bits on various platforms
is the other part).


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwlKbQACgkQ+gerLs4ltQ4TfACdHgLXPHeGw42GidhQdzABkQaR
+nEAoLE1sd+g1aJuxSn6swvvX0g52EU4
=MSwx
-----END PGP SIGNATURE-----


From ianb at colorstudy.com  Sat Jun 26 00:26:20 2010
From: ianb at colorstudy.com (Ian Bicking)
Date: Fri, 25 Jun 2010 17:26:20 -0500
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com> 
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com> 
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
Message-ID: <AANLkTinzi4upWoxccUT_5hGxuFu71yxBy_K-ejzXe6uV@mail.gmail.com>

On Fri, Jun 25, 2010 at 4:02 PM, Guido van Rossum <guido at python.org> wrote:

> On Fri, Jun 25, 2010 at 1:43 PM, Glyph Lefkowitz
> > I'd like a version of 'decode' which would give me a type that was, in every
> > respect, unicode, and responded to all protocols exactly as other
> > unicode objects (or "str objects", if you prefer py3 nomenclature ;-)) do,
> > but wouldn't actually copy any of that memory unless it really needed to
> > (for example, to pass to a C API that expected native wide characters), and
> > that would hold on to the original bytes so that it could produce them on
> > demand if encoded to the same encoding again. So, as others in this thread
> > have mentioned, the 'ABC' really implies some stuff about C APIs as well.
> > I'm not sure about the exact performance impact of such a class, which is
> > why I'd like the ability to implement it *outside* of the stdlib and see how
> > it works on a project, and return with a proposal along with some data.
> >  There are also different ways to implement this, and other optimizations
> > (like ropes) which might be better.
> > You can almost do this today, but the lack of things like the hypothetical
> > "__rcontains__" does make it impossible to be totally transparent about it.
>
> But you'd still have to validate it, right? You wouldn't want to go on
> using what you thought was wrapped UTF-8 if it wasn't actually valid
> UTF-8 (or you'd be worse off than in Python 2). So you're really just
> worried about space consumption. I'd like to see a lot of hard memory
> profiling data before I got overly worried about that.
>

It wasn't my profiling, but I seem to recall that Fredrik Lundh specifically
benchmarked ElementTree with all-unicode and sometimes-ascii-bytes, and
found that using Python 2 strs in some cases provided notable advantages.  I
know Stefan copied ElementTree in this regard in lxml, maybe he also did a
benchmark or knows of one?

-- 
Ian Bicking  |  http://blog.ianbicking.org

From pje at telecommunity.com  Sat Jun 26 00:27:04 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Fri, 25 Jun 2010 18:27:04 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100625222722.594D23A4099@sparrow.telecommunity.com>

At 01:18 AM 6/26/2010 +0900, Stephen J. Turnbull wrote:
>It seems to me what is wanted here is something like Perl's taint
>mechanism, for *both* kinds of strings.  Am I missing something?

You could certainly view it as a kind of tainting.  The part where 
the type would be bytes-based is indeed somewhat incidental to the 
actual use case -- it's just that if you already have the bytes, and 
all you want to do is tag them (e.g. the WSGI headers case), the 
extra encoding step seems pointless.

A string coercion protocol (that would be used by .join(), .format(), 
__contains__, __mod__, etc.) would allow you to do whatever sort of 
tainted-string or tainted-bytes implementations one might wish to 
have.  I suppose that tainting user inputs (as in Perl) would be just 
as useful of an application of the same coercion protocol.

Actually, I have another use case for this custom string coercion, 
which is that I once wrote a string subclass whose purpose was to 
track the original file and line number of some text.  Even though 
only my code was manipulating the strings, it was very difficult to 
get the tainting to work correctly without extreme care as to the 
string methods used.  (For example, I had to use string addition 
rather than %-formatting.)
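
The effect is easy to reproduce. The SourceStr class below is a made-up
stand-in for that subclass (not the original code), written for Python
3: addition can be overridden and so keeps the origin, while
%-formatting and join hand back a plain str.

```python
class SourceStr(str):
    """Hypothetical str subclass that remembers where the text came from."""

    def __new__(cls, value, filename, lineno):
        self = super().__new__(cls, value)
        self.filename = filename
        self.lineno = lineno
        return self

    def __add__(self, other):
        # Addition dispatches to the left operand, so the subclass can
        # rebuild itself around the concatenated text.
        return SourceStr(str.__add__(self, other), self.filename, self.lineno)


s = SourceStr("spam", "config.ini", 12)

added = s + " and eggs"               # SourceStr: origin preserved
formatted = "%s and eggs" % s         # plain str: origin lost
joined = " ".join([s, "and eggs"])    # plain str: origin lost
```

In the second and third cases the operation is driven by a plain str
(the format string or the separator), so the subclass never gets a say
in the result type; that is the gap a coercion protocol would fill.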


>But with your architecture, it seems to me that you actually don't
>want polymorphic functions in the stdlib.  You want the stdlib
>functions to be bytes-oriented if and only if they are reliable.  (This
>is what I was saying to Guido elsewhere.)

I'm not sure I follow you.  What I want is for the stdlib to create 
stringlike objects of a type determined by the types of the inputs -- 
where the logic for deciding this coercion can be controlled by the 
input objects' types, rather than putting this in the hands of the 
stdlib function.

And of course, this applies to non-stdlib functions, too -- anything 
that simply manipulates user-defined string classes, should allow the 
user-defined classes to determine the coercion of the result.

>BTW, this was a little unclear to me:
>
>  > [Collisions will] be with other *unicode* strings.  Ones coming
>  > from other code, and literals embedded in the stdlib.
>
>What about the literals in the stdlib?  Are you saying they contain
>invalid code points for your known output encoding?  Or are you saying
>that with non-polymorphic unicode stdlib, you get lots of false
>positives when combining with your validated bytes?

No, I mean that the current string coercion rules cause everything to 
be converted to unicode, thereby discarding the tainting information, 
so to speak.  This applies equally to other tainting use cases, and 
other uses for custom stringlike objects.


From steve at holdenweb.com  Sat Jun 26 00:38:38 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 25 Jun 2010 18:38:38 -0400
Subject: [Python-Dev] Signs of neglect?
Message-ID: <i03b71$538$1@dough.gmane.org>

I was pretty stunned when I tried this. Remember that the Tools
subdirectory is distributed with Windows, so this means we got through
almost two releases without anyone realizing that 2to3 does not appear
to have touched this code.

Yes, I have: http://bugs.python.org/issue9083

When's 3.2 due out?

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From janssen at parc.com  Sat Jun 26 00:40:52 2010
From: janssen at parc.com (Bill Janssen)
Date: Fri, 25 Jun 2010 15:40:52 PDT
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
Message-ID: <26215.1277505652@parc.com>

Guido van Rossum <guido at python.org> wrote:

> On Fri, Jun 25, 2010 at 1:43 PM, Glyph Lefkowitz
> <glyph at twistedmatrix.com> wrote:
> >
> > On Jun 24, 2010, at 4:59 PM, Guido van Rossum wrote:
> >
> > Regarding the proposal of a String ABC, I hope this isn't going to
> > become a backdoor to reintroduce the Python 2 madness of allowing
> > equivalency between text and bytes for *some* strings of bytes and not
> > others.

I never actually replied to this...  Absolutely right, which is why you
might really want another kind of string, rather than a way to treat
some bytes values as strings in some places.  Both Python 2 and Python 3
are missing one of the three types.  Python 1 and 2 didn't have "bytes",
and this caused problems because "str" was pressed into use to hold
arbitrary byte sequences.  (Python 2 "str" has other problems as well,
like losing track of the encoding.)  Python 3 doesn't have Python 2's
"str" (encoded string), and bytes are being pressed into use for that.
Each of these uses is an ad hoc hijack of an inappropriate type, and
additional frameworks not directly supported by the Python language are
being jury-rigged to try to support the uses.

On the other hand, this is all in the eye of the beholder.  Both byte
sequences and strings are horrible formless things; they remind me of
BLISS.  You seldom really have a byte sequence; what you have is an XDR
float or an encoded string or an IP header or an email message.
Similarly for strings; they are really file names or city names or
English sentences or URIs or other things with significant semantic
constraints not captured by the typical type system.  So, yes, there
*is* an inescapable equivalency between text and bytes for *some*
sequences of bytes (those that represent encoded strings) and not others
(those that contain the XDR float, for instance).  Creating a separate
encoded string type would be one way to keep that straight.
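
As a sketch of what such a type might look like (the EncodedString name,
its attributes and its mixing rules are assumptions for illustration,
not an existing or proposed API):

```python
class EncodedString(object):
    """Hypothetical 'encoded string': bytes that carry their encoding."""

    def __init__(self, data, encoding):
        self.data = bytes(data)
        self.encoding = encoding

    def decode(self):
        # The encoding travels with the bytes, so callers never guess it.
        return self.data.decode(self.encoding)

    def __add__(self, other):
        if not isinstance(other, EncodedString):
            # Raw bytes (an XDR float, say) are not text; refuse to mix.
            return NotImplemented
        if other.encoding != self.encoding:
            raise ValueError("cannot mix %s and %s"
                             % (self.encoding, other.encoding))
        return EncodedString(self.data + other.data, self.encoding)


city = EncodedString("Z\u00fcrich".encode("latin-1"), "latin-1")
suffix = EncodedString(b" HB", "latin-1")
station = city + suffix
```

Adding plain bytes to an EncodedString raises TypeError and mixing two
encodings raises ValueError, so the text/bytes equivalence holds only
for values that say they are encoded text.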

> > For my part, what I want out of a string ABC is simply the ability to do
> > application-specific optimizations.
> > There are many applications where all input and output is text, but _must_
> > be UTF-8.  Even GTK uses UTF-8 as its native text representation, so
> > "output" could just be display.
> > Right now, in Python 3, the only way to be "correct" about this is to copy
> > every byte of input into 4 bytes of output, then copy each code point *back*
> > into a single byte of output.  If all your application does is rewrite the
> > occasional XML attribute, for example, this cost can be significant, if not
> > overwhelming.
> > I'd like a version of 'decode' which would give me a type that was, in every
> > respect, unicode, and responded to all protocols exactly as other
> > unicode objects (or "str objects", if you prefer py3 nomenclature ;-)) do,
> > but wouldn't actually copy any of that memory unless it really needed to
> > (for example, to pass to a C API that expected native wide characters), and
> > that would hold on to the original bytes so that it could produce them on
> > demand if encoded to the same encoding again. So, as others in this thread
> > have mentioned, the 'ABC' really implies some stuff about C APIs as well.

Seems like it.

> > I'm not sure about the exact performance impact of such a class, which is
> > why I'd like the ability to implement it *outside* of the stdlib and see how
> > it works on a project, and return with a proposal along with some data.

Yes, exactly.

> >  There are also different ways to implement this, and other optimizations
> > (like ropes) which might be better.
> > You can almost do this today, but the lack of things like the hypothetical
> > "__rcontains__" does make it impossible to be totally transparent about it.
> 
> But you'd still have to validate it, right? You wouldn't want to go on
> using what you thought was wrapped UTF-8 if it wasn't actually valid
> UTF-8 (or you'd be worse off than in Python 2).

Yes, but there are different ways to validate it that have different
performance impacts.  Simply trusting the source of the string, for
example, would be appropriate in some cases.

> So you're really just worried about space consumption. I'd like to see
> a lot of hard memory profiling data before I got overly worried about
> that.

While I've seen some big Web pages, I think the email folks, who often
have to process messages with attachments measuring in the tens of
megabytes, have the stronger problems here, and I think speed may be
more important than memory.  I've built both a Web server and an IMAP
server in Python, and the IMAP server is where the issues of storage
management really prevail.  If you have to convert a 20 MB encoded
string into a Unicode string just to look at the headers as strings, you
have issues.  (The Python email package doesn't do that, by the way.)
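
To make the point concrete, here is a minimal sketch of looking at headers without decoding the body. `header_block` is a hypothetical helper, not what the email package does, and it assumes CRLF line endings with a blank line ending the header section:

```python
def header_block(message_bytes):
    """Decode only the header section of a raw message, leaving the body alone."""
    end = message_bytes.find(b"\r\n\r\n")
    head = message_bytes if end == -1 else message_bytes[:end]
    # Only this (small) slice is copied and decoded; the body is never touched.
    return head.decode("ascii", errors="replace")
```

With a 20 MB attachment behind the headers, the decode cost here depends on the size of the headers only.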

Bill

From steve at holdenweb.com  Sat Jun 26 00:51:53 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 25 Jun 2010 18:51:53 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <51EFE211-DBCA-497E-9BC5-CC0D2256173E@twistedmatrix.com>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<51EFE211-DBCA-497E-9BC5-CC0D2256173E@twistedmatrix.com>
Message-ID: <i03bvs$6tk$1@dough.gmane.org>

Glyph Lefkowitz wrote:
> 
> On Jun 25, 2010, at 5:02 PM, Guido van Rossum wrote:
> 
>> But you'd still have to validate it, right? You wouldn't want to go on
>> using what you thought was wrapped UTF-8 if it wasn't actually valid
>> UTF-8 (or you'd be worse off than in Python 2). So you're really just
>> worried about space consumption.
> 
> So, yes, I am mainly worried about memory consumption, but don't
> underestimate the pure CPU cost of doing all the copying.  It's quite a
> bit faster to simply scan through a string than to scan and while you're
> scanning, keep faulting out the L2 cache while you're accessing some
> other area of memory to store the copy.
> 
Yes, but you are already talking about optimizations that might be
significant for large-ish strings (where large-ish depends on exactly
where Moore's Law is currently delivering computational performance) -
the amount of cache consumed by a ten-byte string will slip by
unnoticed, but at L2 levels megabytes would effectively flush the cache.

> Plus, If I am decoding with the surrogateescape error handler (or its
> effective equivalent), then no, I don't need to validate it in advance;
> interpretation can be done lazily as necessary.  I realize that this is
> just GIGO, but I wouldn't be doing this on data that didn't have an
> explicitly declared or required encoding in the first place.
> 
>> I'd like to see a lot of hard memory profiling data before I got
>> overly worried about that.
> 
> I know of several Python applications that are already constrained by
> memory.  I don't have a lot of hard memory profiling data, but in an
> environment where you're spawning as many processes as you can in order
> to consume _all_ the physically available RAM for string processing, it
> stands to reason that properly decoding everything and thereby exploding
> everything out into 4x as much data (or 2x, if you're lucky) would
> result in a commensurate decrease in throughput.
> 
Yes, UCS-4's impact does seem like it could be horrible for these use
cases. But "knowing of several Python applications that are already
constrained by memory" doesn't mean that it's a bad general decision.
Most users will never notice the difference, so we should try to
accommodate those who do notice a difference without inconveniencing the
rest too much.
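
For a feel of the 4x (or 2x) figure, encoded sizes can stand in for the internal storage of wide and narrow builds. This is an approximation with made-up sample text, not a measurement of any interpreter's real memory use:

```python
text = "id=12345 " * 1000  # plain ASCII, as is common in protocol data

utf8_size = len(text.encode("utf-8"))       # 1 byte per ASCII character
ucs2_size = len(text.encode("utf-16-le"))   # 2 bytes per character here
ucs4_size = len(text.encode("utf-32-le"))   # 4 bytes per character

ratio_ucs2 = ucs2_size / utf8_size
ratio_ucs4 = ucs4_size / utf8_size
```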

> I don't think I could even reasonably _propose_ that such a project stop
> treating textual data as bytes, because there's no optimization strategy
> once that sort of architecture has been put into place. If your function
> says "this takes unicode", then you just have to bite the bullet and
> decode it, or rewrite it again to have a different requirement.
> 
That has always been my understanding. I regard it as a sort of
intellectual tax on the United States (and its Western collaborators)
for being too dim to realise that eventually they would end up selling
computers to people with more than 256 characters in their alphabet.
Sorry guys, but your computers are only as fast as you think they are
when you only talk to each other.

> So, right now, I don't know where I'd get the data with to make the
> argument in the first place :).  If there were some abstraction in the
> core's treatment of strings, though, and I could decode things and note
> their encoding without immediately paying this cost (or alternately,
> paying the cost to see if it's so bad, but with the option of managing
> it or optimizing it separately).  This is why I'm asking for a way for
> me to implement my own string type, and not for a change of behavior or
> an optimization in the stdlib itself: I could be wrong, I don't have a
> particularly high level of certainty in my performance estimates, but I
> think that my concerns are realistic enough that I don't want to embark
> on a big re-architecture of text-handling only to have it become a
> performance nightmare that needs to be reverted.
> 
Recent experience with the thoroughness of the Python 3 release
preparations leads me to believe that *anything* new needs to prove its
worth outside the stdlib for a while.

> As Robert Collins pointed out, they already have performance issues
> related to encoding in Bazaar.  I know they've done a lot of profiling
> in that area, so I hope eventually someone from that project will show
> up with some data to demonstrate it :).  And I've definitely heard many,
> many anecdotes (some of them in this thread) about people distorting
> their data structures in various ways to avoid paying decoding cost in
> the ASCII/latin1 case, whether it's *actually* a significant performance
> issue or not.  I would very much like to tell those people "Just call
> .decode(), and if it turns out to actually be a performance issue, you
> can always deal with it later, with a custom string type."  I'm
> confident that in *most* cases, it would not be.
> 
Well that would be a nice win.

> Anyway, this may be a serious issue, but I increasingly feel like I'm
> veering into python-ideas territory, so perhaps I'll just have to burn
> this bridge when I come to it.  Hopefully after the moratorium.
> 
Sounds like it's worth pursuing, though. I mean after all, we don't want
to leave *all* the bit-twiddling to the low-level language users ;-).

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From steve at holdenweb.com  Sat Jun 26 00:57:10 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 25 Jun 2010 18:57:10 -0400
Subject: [Python-Dev] docs - Copy
In-Reply-To: <4C2511EA.3000200@v.loewis.de>
References: <AANLkTinwEPLg-t5VJMBRilP4T7MFKyRJaiHTc5Mj_pM9@mail.gmail.com>	<i02n6d$1ao$2@dough.gmane.org>
	<4C2511EA.3000200@v.loewis.de>
Message-ID: <i03c9q$6tk$4@dough.gmane.org>

Martin v. Löwis wrote:
> Am 25.06.2010 18:57, schrieb Terry Reedy:
>> On 6/24/2010 8:51 PM, Rich Healey wrote:
>>> http://docs.python.org/library/copy.html
>> Discussion of the wording of current docs should go to python-list.
>> Py-dev is for development of future Python.
> 
> No no no. [...]

It isn't always easy to tell, but I think Martin meant "no".

regards
 Steve


From steve at holdenweb.com  Sat Jun 26 00:54:24 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 25 Jun 2010 18:54:24 -0400
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <4C2512A2.1040404@v.loewis.de>
References: <20100625121847.60331d9e@heresy> <4C2512A2.1040404@v.loewis.de>
Message-ID: <i03c4j$6tk$2@dough.gmane.org>

Martin v. Löwis wrote:
> Am 25.06.2010 18:18, schrieb Barry Warsaw:
>> Benjamin is still planning to release Python 2.7 final on 2010-07-03, so it's
>> time for me to work out the release schedule for Python 2.6.6 - likely the
>> last maintenance release for Python 2.6.
>>
>> Because summer schedules are crazy, and I want to leave two weeks between
>> 2.6.6 rc1 and 2.6.6 final, my current schedule looks like:
>>
>> * Python 2.6.6 rc 1 on Monday 2010-08-02
>> * Python 2.6.6 final on Monday 2010-08-16
> 
> That would barely work for me. If schedule slips in any way, we'll have
> to move the release into end-of-September (but the days as proposed are
> fine).
> 
> Regards,
> Martin

A six-week slippage wouldn't be good. What's the relevant chaos theory
when a one- or two-day hold leads to a six-week delivery slippage?

Let's hope things don't slip!

regards
 Steve


From steve at holdenweb.com  Sat Jun 26 01:00:19 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 25 Jun 2010 19:00:19 -0400
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <4C251133.2090505@v.loewis.de>
References: <20100624232821.GB10805@thorne.id.au>	<4C23F1AD.9040809@v.loewis.de>	<20100625003149.GA16084@thorne.id.au>
	<4C251133.2090505@v.loewis.de>
Message-ID: <4C253503.6080300@holdenweb.com>

Martin v. Löwis wrote:
>>>> I am extremely keen for this to happen. Does anyone have ownership of this
>>>> project? There was some discussion of it up-list but the discussion fizzled.
>>> Can you please explain what "this project" is, in the context of your
>>> message? GSoC? GHOP?
>> Oh, I thought this was quite clear. I was specifically meaning the large
>> "Python 2 or 3" button on python.org. It would help users who want to know
>> what version of python to use if they had a clear guide as to what version
>> to download.
> 
> Ah, ok. No, nobody has taken ownership of that project, and likely,
> nobody actually will - unless you volunteer.
> 
Or perhaps spur the pydotorg community on with some well-placed
encouragement. Nobody ever seems to say "thanks" to those guys except
the jobs posters - *they* seem pretty happy.

regards
 Steve

From steve at holdenweb.com  Sat Jun 26 00:55:16 2010
From: steve at holdenweb.com (Steve Holden)
Date: Fri, 25 Jun 2010 18:55:16 -0400
Subject: [Python-Dev] Schedule for Python 2.6.6
In-Reply-To: <4C251CA7.3070902@v.loewis.de>
References: <20100625121847.60331d9e@heresy>	<4C2512A2.1040404@v.loewis.de>	<20100625170322.5ece724f@heresy>
	<4C251CA7.3070902@v.loewis.de>
Message-ID: <i03c67$6tk$3@dough.gmane.org>

Martin v. Löwis wrote:
>> Would that be bad or good (slipping into September)?  I'd like to get a
>> release out as soon after 2.7 final as possible, but it's an entirely
>> self-imposed deadline.  There's no reason why we can't push the whole 2.6.6
>> thing later if that works better for you.  OTOH, I can't go much earlier so if
>> September is bad for you, then we'll stick to the above dates.
> 
> I think we can strive for your original proposal. If it slips, we let it
> slip by a month or two.
> 
> Regards,
> Martin

I suppose for 2.6.6 it's not really critical.

regards
 Steve


From benjamin at python.org  Sat Jun 26 01:23:02 2010
From: benjamin at python.org (Benjamin Peterson)
Date: Fri, 25 Jun 2010 18:23:02 -0500
Subject: [Python-Dev] Signs of neglect?
In-Reply-To: <i03b71$538$1@dough.gmane.org>
References: <i03b71$538$1@dough.gmane.org>
Message-ID: <AANLkTin0jc13wW87qZouchtfLkyHEdaeJZ_F2vQowKam@mail.gmail.com>

2010/6/25 Steve Holden <steve at holdenweb.com>:
> I was pretty stunned when I tried this. Remember that the Tools
> subdirectory is distributed with Windows, so this means we got through
> almost two releases without anyone realizing that 2to3 does not appear
> to have touched this code.

I would call it more a sign of no tests rather than one of neglect and
perhaps also an indication of the usefulness of those tools.

>
> Yes, I have: http://bugs.python.org/issue9083
>
> When's 3.2 due out?

PEP 392.


-- 
Regards,
Benjamin

From fijall at gmail.com  Sat Jun 26 01:27:52 2010
From: fijall at gmail.com (Maciej Fijalkowski)
Date: Fri, 25 Jun 2010 17:27:52 -0600
Subject: [Python-Dev] PyPy 1.3 released
Message-ID: <AANLkTikaN3p6BNFUfXL4RlWB28ZwjrajeUb34r8SGvdy@mail.gmail.com>

=======================
PyPy 1.3: Stabilization
=======================

Hello.

We're pleased to announce the release of PyPy 1.3. This release has two major
improvements. First of all, we stabilized the JIT compiler since the 1.2 release,
answered user issues, fixed bugs, and generally improved speed.

We're also pleased to announce alpha support for loading CPython extension
modules written in C. While the main purpose of this release is increased
stability, this feature is in alpha stage and it is not yet suited for
production environments.

Highlights of this release
==========================

* We introduced support for CPython extension modules written in C. As of now,
  this support is in alpha, and it's very unlikely unaltered C extensions will
  work out of the box, due to missing functions or refcounting details. The
  support is disabled by default, so you have to do::

   import cpyext

  before trying to import any .so file. Also, libraries are source-compatible
  and not binary-compatible. That means you need to recompile binaries, using
  for example::

   python setup.py build

  Details may vary, depending on your build system. Make sure you include
  the ``import cpyext`` line at the beginning of setup.py or put it in your
  PYTHONSTARTUP.

  This is an alpha feature. It'll likely segfault. You have been warned!

* JIT bugfixes. A lot of bugs reported for the JIT have been fixed, and its
  stability has greatly improved since the 1.2 release.

* Various small improvements have been added to the JIT code, as well as a great
  speedup of compiling time.
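
For the first highlight, a setup.py that should keep working on interpreters without the extension support might guard the import. This is only a sketch: the module name ``cpyext`` comes from the text above, while the ``HAVE_CPYEXT`` flag is a made-up name:

```python
# Hypothetical top of a setup.py shared between CPython and PyPy 1.3.
try:
    import cpyext  # noqa: F401  (enables loading C extensions on PyPy 1.3)
    HAVE_CPYEXT = True
except ImportError:
    # Interpreters without the module: nothing to enable.
    HAVE_CPYEXT = False
```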

Cheers,
Maciej Fijalkowski, Armin Rigo, Alex Gaynor, Amaury Forgeot d'Arc and
the PyPy team

From ncoghlan at gmail.com  Sat Jun 26 02:19:51 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 26 Jun 2010 10:19:51 +1000
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
	<A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
Message-ID: <AANLkTilDuY2bWXY8Yc_FaICCknv-zHsQox1F6n3ybREt@mail.gmail.com>

On Sat, Jun 26, 2010 at 6:12 AM, James Y Knight <foom at fuhm.net> wrote:
> However, then you have to also consider python packages made up of multiple
> distro packages -- like twisted or zope. Twisted includes some C extensions
> in the core package. But then there are other twisted modules (installed
> under a "twisted.foo" name) which do not include C extensions. If the base
> twisted package is installed under a version-specific directory, then all of
> the submodule packages need to also be installed under the same
> version-specific directory (and thus built for all versions).
>
> In the past, it has proven somewhat tricky to coordinate which directory the
> modules for package "foo" should be installed in, because you need to know
> whether *any* of the related packages includes a native ".so" file, not just
> the current package.
>
> The converse situation, where a base package did *not* get installed into a
> version-specific directory because it includes no native code, but a
> submodule *does* include a ".so" file, is even trickier.

I think there are two major ways to tackle this:
- allow multiple versions of a .so file within a single directory (i.e
Barry's current suggestion)
- enhanced namespace packages, allowing a single package to be spread
across multiple directories, some of which may be Python version
specific (i.e. modifications to PEP 382 to support references to
version-specific directories)
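
The first option can be pictured as tagged filenames living side by side, with each interpreter picking its own. The sketch below uses an invented tag scheme ("py32" and so on) and a hypothetical `pick_extension` helper; it is not the naming proposed in any PEP:

```python
import sys


def pick_extension(candidates, module, tag=None):
    """Return the candidate filename built for this interpreter, or None."""
    if tag is None:
        # Invented tag format: "py" plus major and minor version.
        tag = "py%d%d" % sys.version_info[:2]
    wanted = "%s.%s.so" % (module, tag)
    return wanted if wanted in candidates else None
```

A directory holding `foo.py26.so` and `foo.py32.so` would then serve both interpreters without a version-specific directory.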

I think a new PEP is definitely in order, especially to explain why
enhancing PEP 382 to support saying "look over here for the .so files
for this version" isn't a preferable approach.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at thorne.id.au  Sat Jun 26 02:41:34 2010
From: stephen at thorne.id.au (Stephen Thorne)
Date: Sat, 26 Jun 2010 10:41:34 +1000
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <4C251C4D.50806@v.loewis.de>
References: <20100624232821.GB10805@thorne.id.au>
	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
	<4C251133.2090505@v.loewis.de> <4C251A38.3090205@voidspace.org.uk>
	<4C251C4D.50806@v.loewis.de>
Message-ID: <20100626004134.GB16084@thorne.id.au>

On 2010-06-25, "Martin v. Löwis" wrote:
> > What page were we suggesting linking to?
> 
> I don't think anybody proposed anything specific. Steve Holden
> suggested it should go to "reasoned discussion of the
> pros and cons as evinced in this thread". Stephen Thorne didn't
> propose anything specific but to have a large button.

I didn't propose anything, I heard a good idea that I'd like to see followed
through.

-- 
Regards,
Stephen Thorne

From martin at v.loewis.de  Sat Jun 26 02:49:49 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sat, 26 Jun 2010 02:49:49 +0200
Subject: [Python-Dev] "2 or 3" link on python.org
In-Reply-To: <20100626004134.GB16084@thorne.id.au>
References: <20100624232821.GB10805@thorne.id.au>
	<4C23F1AD.9040809@v.loewis.de>
	<20100625003149.GA16084@thorne.id.au>
	<4C251133.2090505@v.loewis.de> <4C251A38.3090205@voidspace.org.uk>
	<4C251C4D.50806@v.loewis.de> <20100626004134.GB16084@thorne.id.au>
Message-ID: <4C254EAD.4060006@v.loewis.de>

Am 26.06.2010 02:41, schrieb Stephen Thorne:
> On 2010-06-25, "Martin v. Löwis" wrote:
>>> What page were we suggesting linking to?
>>
>> I don't think anybody proposed anything specific. Steve Holden
>> suggested it should go to "reasoned discussion of the
>> pros and cons as evinced in this thread". Stephen Thorne didn't
>> propose anything specific but to have a large button.
> 
> I didn't propose anything, I heard a good idea that I'd like to see followed
> through.

Ah, ok. I thought "I am extremely keen for this to happen" indicated
that you would be willing to volunteer time to make it happen.

Regards,
Martin

From ncoghlan at gmail.com  Sat Jun 26 04:59:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sat, 26 Jun 2010 12:59:31 +1000
Subject: [Python-Dev] Signs of neglect?
In-Reply-To: <AANLkTin0jc13wW87qZouchtfLkyHEdaeJZ_F2vQowKam@mail.gmail.com>
References: <i03b71$538$1@dough.gmane.org>
	<AANLkTin0jc13wW87qZouchtfLkyHEdaeJZ_F2vQowKam@mail.gmail.com>
Message-ID: <AANLkTikLhAb6SbLR26MMG94fN70cULZh6SsDaHGwmcnx@mail.gmail.com>

On Sat, Jun 26, 2010 at 9:23 AM, Benjamin Peterson <benjamin at python.org> wrote:
> 2010/6/25 Steve Holden <steve at holdenweb.com>:
> I would call it more a sign of no tests rather than one of neglect and
> perhaps also an indication of the usefulness of those tools.

Less than useful tools with no tests probably qualify as neglected...

An assessment of the contents of the Py3k tools directory is probably
in order, with at least a basic "will it run?" check added for those
we decide to keep.
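
A very basic version of such a check could just try to compile every script. The function name below is hypothetical, and compiling only catches syntax problems (such as leftover Python 2 print statements), not missing modules:

```python
import os


def scripts_that_fail_to_compile(root):
    """Return (path, exception) pairs for .py files under root that do not compile."""
    failures = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                source = handle.read()
            try:
                # Compile without executing; nothing is written to disk.
                compile(source, path, "exec")
            except (SyntaxError, ValueError) as exc:
                failures.append((path, exc))
    return failures
```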

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From stephen at xemacs.org  Sat Jun 26 05:42:25 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sat, 26 Jun 2010 12:42:25 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100625222722.594D23A4099@sparrow.telecommunity.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
Message-ID: <87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:

 > it's just that if you already have the bytes, and all you want to
 > do is tag them (e.g. the WSGI headers case), the extra encoding
 > step seems pointless.

Well, I'll have to concede that unless and until I get involved in the
WSGI development effort.<wink>

 > >But with your architecture, it seems to me that you actually don't
 > >want polymorphic functions in the stdlib.  You want the stdlib
 > >functions to be bytes-oriented if and only if they are reliable.  (This
 > >is what I was saying to Guido elsewhere.)
 > 
 > I'm not sure I follow you.

What I'm saying here is that if bytes are the signal of validity, and
the stdlib functions preserve validity, then it's better to have the
stdlib functions object to unicode data as an argument.  Compare the
alternative: it returns a unicode object which might get passed around
for a while before one of your functions receives it and identifies it
as unvalidated data.

But you agree that there are better mechanisms for validation
(although not available in Python yet), so I don't see this as a
potential obstacle to polymorphism now.

 > What I want is for the stdlib to create stringlike objects of a
 > type determined by the types of the inputs --

In general this is a hard problem, though.  Polymorphism, OK, one-way
tainting OK, but in general combining related types is pretty
arbitrary, and as in the encoded-bytes case, the result type often
varies depending on expectations of callers, not the types of the
data.
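
A small sketch of the easy case, where the result type simply follows the input type and mixing is refused. `add_prefix` is a made-up function, not a stdlib one:

```python
def add_prefix(prefix, value):
    """Join prefix and value with a colon; str in gives str out, bytes gives bytes."""
    if type(prefix) is not type(value):
        # Refuse to guess an encoding when the two types differ.
        raise TypeError("cannot mix %s and %s"
                        % (type(prefix).__name__, type(value).__name__))
    sep = ":" if isinstance(value, str) else b":"
    return prefix + sep + value
```

The hard cases are the ones this sketch dodges: deciding what to return when the inputs legitimately differ.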

From greg.ewing at canterbury.ac.nz  Sat Jun 26 09:58:17 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sat, 26 Jun 2010 19:58:17 +1200
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <i039jr$h8$1@dough.gmane.org>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<i039jr$h8$1@dough.gmane.org>
Message-ID: <4C25B319.8040804@canterbury.ac.nz>

Tres Seaver wrote:

> I do know for a fact that using a UCS2-compiled Python instead of the
> system's UCS4-compiled Python leads to measurable, noticeable drop in
> memory consumption of long-running webserver processes using Unicode

Would there be any sanity in having an option to compile
Python with UTF-8 as the internal string representation?

-- 
Greg


From stefan_ml at behnel.de  Sat Jun 26 11:34:56 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 26 Jun 2010 11:34:56 +0200
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <AANLkTinzi4upWoxccUT_5hGxuFu71yxBy_K-ejzXe6uV@mail.gmail.com>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<AANLkTinzi4upWoxccUT_5hGxuFu71yxBy_K-ejzXe6uV@mail.gmail.com>
Message-ID: <i04hk1$qjq$1@dough.gmane.org>

Ian Bicking, 26.06.2010 00:26:
> On Fri, Jun 25, 2010 at 4:02 PM, Guido van Rossum wrote:
>> On Fri, Jun 25, 2010 at 1:43 PM, Glyph Lefkowitz
>>> I'd like a version of 'decode' which would give me a type that was, in
>> every
>>> respect, unicode, and responded to all protocols exactly as other
>>> unicode objects (or "str objects", if you prefer py3 nomenclature ;-))
>> do,
>>> but wouldn't actually copy any of that memory unless it really needed to
>>> (for example, to pass to a C API that expected native wide characters),
>> and
>>> that would hold on to the original bytes so that it could produce them on
>>> demand if encoded to the same encoding again. So, as others in this
>> thread
>>> have mentioned, the 'ABC' really implies some stuff about C APIs as well.

Well, there's the buffer API, so you can already create something that 
refers to an existing C buffer. However, with respect to a string, you will 
have to make sure the underlying buffer doesn't get freed while the string 
is still in use. That will be hard and sometimes impossible to do at the 
C-API level, even if the string is allowed to keep a reference to something 
that holds the buffer.
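
At the Python level the lifetime problem looks like this. The sketch uses a bytearray as a stand-in for the C buffer; a bytearray happens to provide the guard itself by refusing to resize while views exist, which a plain char* inside a C library does not:

```python
buf = bytearray(b"<a>text</a>")

whole = memoryview(buf)   # zero-copy view of the buffer
chunk = whole[3:7]        # still no copy, just a window onto buf
seen = bytes(chunk)       # copying happens only here, on request

try:
    buf.extend(b"!")      # resizing while views are exported
    resized = True
except BufferError:
    resized = False       # the owner refuses: the views pin the memory

# Once every view is released, the owner may change again.
chunk.release()
whole.release()
buf.extend(b"!")
```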

At least in lxml, such a feature would be completely worthless, as text is 
never held by any ref-counted Python wrapper object. It's only part of the 
XML tree, which is allowed to change at (more or less) any time, so the 
underlying char* buffer could just get freed without further notice. Adding 
a guard against that would likely have a larger impact on the performance 
than the decoding operations.


>>> I'm not sure about the exact performance impact of such a class, which is
>>> why I'd like the ability to implement it *outside* of the stdlib and see
>> how
>>> it works on a project, and return with a proposal along with some data.
>>>   There are also different ways to implement this, and other optimizations
>>> (like ropes) which might be better.
>>> You can almost do this today, but the lack of things like the
>> hypothetical
>>> "__rcontains__" does make it impossible to be totally transparent about
>> it.
>>
>> But you'd still have to validate it, right? You wouldn't want to go on
>> using what you thought was wrapped UTF-8 if it wasn't actually valid
>> UTF-8 (or you'd be worse off than in Python 2). So you're really just
>> worried about space consumption. I'd like to see a lot of hard memory
>> profiling data before I got overly worried about that.
>
> It wasn't my profiling, but I seem to recall that Fredrik Lundh specifically
> benchmarked ElementTree with all-unicode and sometimes-ascii-bytes, and
> found that using Python 2 strs in some cases provided notable advantages.  I
> know Stefan copied ElementTree in this regard in lxml, maybe he also did a
> benchmark or knows of one?

Actually, bytes vs. unicode doesn't make that big a difference in Py2 for 
lxml. ElementTree is a lot older, so I guess it made a larger difference 
when its code was written (and I even think I recall seeing numbers for 
lxml where it seemed to make a notable difference).

In lxml, text content is stored in the C tree of libxml2 as UTF-8 encoded 
char* text. On request, lxml creates a string object from it and returns 
it. In Py2, it checks for plain ASCII content first and returns a byte 
string for that. Only non-ASCII strings are returned as decoded unicode 
strings. In Py3, it always returns unicode strings.
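
In Python terms, the Py2 behaviour described above amounts to something like the following. This is only an illustration written for Python 3 (where the "byte string" is a bytes object); the function name is invented and lxml's real code works on the C level:

```python
def text_from_utf8(raw):
    """Return plain-ASCII content unchanged as bytes, anything else as decoded text."""
    try:
        raw.decode("ascii")   # cheap check: is this plain ASCII?
    except UnicodeDecodeError:
        return raw.decode("utf-8")
    return raw
```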

When I run a little benchmark on lxml in Py2.6.5 that just reads some short 
text content from an Element object, I only see a tiny difference between 
unicode strings and byte strings. The gap obviously increases when the text 
gets longer, e.g. when I serialise the complete text content of an XML 
document to either a byte string or a unicode string. But even for 
documents in the megabyte range we are still talking about single 
milliseconds here, and the difference stays well below 10%. It's seriously 
hard to make that the performance bottleneck in an XML application.

Also, since the string objects are only instantiated on request, memory 
isn't an issue either. That's different for (c)ElementTree again, where 
string content is stored as Python objects. Four times the size even for 
plain ASCII strings (e.g. numbers, IDs or even trailing whitespace!) can 
well become a problem there, and can easily dominate the overall size of 
the in-memory tree. Plain ASCII content is surprisingly common in XML 
documents.

Stefan


From stefan_ml at behnel.de  Sat Jun 26 11:41:48 2010
From: stefan_ml at behnel.de (Stefan Behnel)
Date: Sat, 26 Jun 2010 11:41:48 +0200
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <4C25B319.8040804@canterbury.ac.nz>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>	<i039jr$h8$1@dough.gmane.org>
	<4C25B319.8040804@canterbury.ac.nz>
Message-ID: <i04i0s$qjq$2@dough.gmane.org>

Greg Ewing, 26.06.2010 09:58:
> Tres Seaver wrote:
>
>> I do know for a fact that using a UCS2-compiled Python instead of the
>> system's UCS4-compiled Python leads to measurable, noticeable drop in
>> memory consumption of long-running webserver processes using Unicode
>
> Would there be any sanity in having an option to compile
> Python with UTF-8 as the internal string representation?

It would break Py_UNICODE, because the internal size of a unicode character 
would no longer be fixed.
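
The variable width is easy to see from Python. In this sketch, `byte_offset` is a hypothetical helper showing why indexing by character would need a scan from the start of a UTF-8 buffer:

```python
text = "a\u00e9\u20ac\U0001F600"  # 'a', e-acute, euro sign, an emoji

# Bytes needed per character in UTF-8: between one and four.
widths = [len(ch.encode("utf-8")) for ch in text]


def byte_offset(text, index):
    """Byte position of character `index` in the UTF-8 form of text."""
    # Everything before the index has to be measured first.
    return len(text[:index].encode("utf-8"))
```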

Stefan


From steve at holdenweb.com  Sat Jun 26 13:18:37 2010
From: steve at holdenweb.com (Steve Holden)
Date: Sat, 26 Jun 2010 07:18:37 -0400
Subject: [Python-Dev] Signs of neglect?
In-Reply-To: <AANLkTikLhAb6SbLR26MMG94fN70cULZh6SsDaHGwmcnx@mail.gmail.com>
References: <i03b71$538$1@dough.gmane.org>	<AANLkTin0jc13wW87qZouchtfLkyHEdaeJZ_F2vQowKam@mail.gmail.com>
	<AANLkTikLhAb6SbLR26MMG94fN70cULZh6SsDaHGwmcnx@mail.gmail.com>
Message-ID: <4C25E20D.2040007@holdenweb.com>

Nick Coghlan wrote:
> On Sat, Jun 26, 2010 at 9:23 AM, Benjamin Peterson <benjamin at python.org> wrote:
>> 2010/6/25 Steve Holden <steve at holdenweb.com>:
>> I would call it more a sign of no tests rather than one of neglect and
>> perhaps also an indication of the usefulness of those tools.
> 
> Less than useful tools with no tests probably qualify as neglected...
> 
> An assessment of the contents of the Py3k tools directory is probably
> in order, with at least a basic "will it run?" check added for those
> we decide to keep..
> 
Neither webchecker nor wcgui.py will run - the former breaks because
sgmllib is missing, the latter because it uses the wrong name for
"tkinter" (but overcoming this will throw it back to an sgmllib
dependency too).

Guido thinks it's OK to abandon at least some of them, so I don't see
the rest getting much love in the future. They do need sorting through -
I don't see anyone wanting xxci.py, for example ("check in files for
which rcsdiff returns nonzero exit status").

But I'm grateful you agree with my diagnosis of neglect (not that a
diagnosis in itself is going to help in fixing things).

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000

From nagle at animats.com  Sat Jun 26 08:11:49 2010
From: nagle at animats.com (John Nagle)
Date: Fri, 25 Jun 2010 23:11:49 -0700
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
 thread usage, and a path to getting rid of the GIL
Message-ID: <4C259A25.1060705@animats.com>

We have just released a proof-of-concept implementation of a new
approach to thread management - "newthreading".  It is available
for download at

     https://sourceforge.net/projects/newthreading/

The user's guide is at

     http://www.animats.com/papers/languages/newthreadingintro.html

This is a pure Python implementation of synchronized objects, along
with a set of restrictions which make programs race-condition free,
even without a Global Interpreter Lock.  The basic idea is that
classes derived from SynchronizedObject are automatically locked
at entry and unlocked at exit. They're also unlocked when a thread
blocks within the class.  So at no time can two threads be active
in such a class at one time.

In addition, only "frozen" objects can be passed in and out of
synchronized objects.  (This is somewhat like the multiprocessing
module, where you can only pass objects that can be "pickled".
But it's not as restrictive; multiple threads can access the
same synchronized object, one at a time.)
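
The lock-at-entry, unlock-at-exit idea can be sketched roughly as follows.
This is not the newthreading implementation: SynchronizedSketch, _locked and
Counter are hypothetical names made up for this illustration, and the sketch
neither releases the lock when a thread blocks nor enforces the "frozen"
argument rule. It assumes Python 3.6 or later for __init_subclass__.

```python
import functools
import threading


def _locked(method):
    """Wrap a method so it runs while holding the instance's lock."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        with self._lock:          # locked at entry, unlocked at exit
            return method(self, *args, **kwargs)
    return wrapper


class SynchronizedSketch:
    """Hypothetical stand-in; not newthreading's SynchronizedObject."""

    def __init__(self):
        self._lock = threading.RLock()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap every public method defined directly on the subclass.
        for name, attr in list(vars(cls).items()):
            if callable(attr) and not name.startswith("_"):
                setattr(cls, name, _locked(attr))


class Counter(SynchronizedSketch):
    def __init__(self):
        super().__init__()
        self.value = 0

    def increment(self):
        self.value += 1


def run(counter, n):
    for _ in range(n):
        counter.increment()


counter = Counter()
threads = [threading.Thread(target=run, args=(counter, 10000))
           for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter.value)
```

Because only one thread at a time can be inside increment(), the four
threads cannot lose updates to each other.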

This pure Python implementation is usable, but does not improve
performance.  It's a proof of concept implementation so that
programmers can try out synchronized classes and see what it's
like to work within those restrictions.

The semantics of Python don't change for single-thread programs.
But when the program forks off the first new thread, the rules
change, and some of the dynamic features of Python are disabled.

Some of the ideas are borrowed from Java, and some are from
"safethreading".  The point is to come up with a set of liveable
restrictions which would allow getting rid of the GIL.  This
is becoming essential as Unladen Swallow starts to work and the
number of processors per machine keeps climbing.

This may in time become a Python Enhancement Proposal.  We'd like
to get some experience with it first. Try it out and report back.
The SourceForge forum for the project is the best place to report problems.

				John Nagle

From arigo at tunes.org  Sat Jun 26 10:34:57 2010
From: arigo at tunes.org (Armin Rigo)
Date: Sat, 26 Jun 2010 10:34:57 +0200
Subject: [Python-Dev] [pypy-dev] PyPy 1.3 released
In-Reply-To: <AANLkTikaN3p6BNFUfXL4RlWB28ZwjrajeUb34r8SGvdy@mail.gmail.com>
References: <AANLkTikaN3p6BNFUfXL4RlWB28ZwjrajeUb34r8SGvdy@mail.gmail.com>
Message-ID: <20100626083457.GA14816@code0.codespeak.net>

Hi,

On Fri, Jun 25, 2010 at 05:27:52PM -0600, Maciej Fijalkowski wrote:
>    python setup.py build

As corrected on the blog (http://morepypy.blogspot.com/), this line
should read:

     pypy setup.py build


Armin.

From fuzzyman at voidspace.org.uk  Sat Jun 26 15:29:24 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Sat, 26 Jun 2010 14:29:24 +0100
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
 thread usage, and a path to getting rid of the GIL
In-Reply-To: <4C259A25.1060705@animats.com>
References: <4C259A25.1060705@animats.com>
Message-ID: <4C2600B4.5020503@voidspace.org.uk>

On 26/06/2010 07:11, John Nagle wrote:
> We have just released a proof-of-concept implementation of a new
> approach to thread management - "newthreading". It is available
> for download at
>
> https://sourceforge.net/projects/newthreading/
>
> The user's guide is at
>
> http://www.animats.com/papers/languages/newthreadingintro.html

The user guide says:

The suggested import is

from newthreading import *

The import * form is considered bad practice in *general* and should not 
be recommended unless there is a good reason. This is slightly off-topic 
for python-dev, although I appreciate that you want feedback with the 
eventual goal of producing a PEP - however the introduction of 
free-threading in Python has not been hampered by lack of 
synchronization primitives but by the difficulty of changing the 
interpreter without unduly impacting single threaded code.

Providing an alternative garbage collection mechanism other than 
reference counting would be a more interesting first-step as far as I 
can see, as that removes the locking required around every access to an 
object (which currently touches the reference count). Introducing 
free-threading by *changing* the threading semantics (so you can't share 
non-frozen objects between threads) would not be acceptable. That 
comment is likely to be based on a misunderstanding of your future 
intentions though. :-)
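
As a small illustration of how pervasive reference count updates are in
CPython, the sketch below only stores an object in a list and observes that
the object's count changes. It assumes CPython; sys.getrefcount is
implementation-specific and its absolute values vary between versions, so
only the difference is meaningful.

```python
import sys

obj = object()
before = sys.getrefcount(obj)

# Merely storing the object elsewhere writes to the object itself,
# because each new reference bumps its reference count.
holders = [obj, obj]
after = sys.getrefcount(obj)

print("refcount grew by", after - before)
```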

All the best,

Michael Foord
>
> This is a pure Python implementation of synchronized objects, along
> with a set of restrictions which make programs race-condition free,
> even without a Global Interpreter Lock. The basic idea is that
> classes derived from SynchronizedObject are automatically locked
> at entry and unlocked at exit. They're also unlocked when a thread
> blocks within the class. So at no time can two threads be active
> in such a class at one time.
>
> In addition, only "frozen" objects can be passed in and out of
> synchronized objects. (This is somewhat like the multiprocessing
> module, where you can only pass objects that can be "pickled".
> But it's not as restrictive; multiple threads can access the
> same synchronized object, one at a time.)
>
> This pure Python implementation is usable, but does not improve
> performance. It's a proof of concept implementation so that
> programmers can try out synchronized classes and see what it's
> like to work within those restrictions.
>
> The semantics of Python don't change for single-thread programs.
> But when the program forks off the first new thread, the rules
> change, and some of the dynamic features of Python are disabled.
>
> Some of the ideas are borrowed from Java, and some are from
> "safethreading". The point is to come up with a set of liveable
> restrictions which would allow getting rid of the GIL. This
> is becoming essential as Unladen Swallow starts to work and the
> number of processors per machine keeps climbing.
>
> This may in time become a Python Enhancement Proposal. We'd like
> to get some experience with it first. Try it out and report back.
> The SourceForge forum for the project is the best place to report 
> problems.
>
> John Nagle


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From jnoller at gmail.com  Sat Jun 26 16:28:50 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sat, 26 Jun 2010 10:28:50 -0400
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
	thread usage, and a path to getting rid of the GIL
In-Reply-To: <4C2600B4.5020503@voidspace.org.uk>
References: <4C259A25.1060705@animats.com> <4C2600B4.5020503@voidspace.org.uk>
Message-ID: <AANLkTik0GOiNln5edddpzguFfZO7pi4RnWF_vAltlghm@mail.gmail.com>

On Sat, Jun 26, 2010 at 9:29 AM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
> On 26/06/2010 07:11, John Nagle wrote:
>>
>> We have just released a proof-of-concept implementation of a new
>> approach to thread management - "newthreading". It is available
>> for download at
>>
>> https://sourceforge.net/projects/newthreading/
>>
>> The user's guide is at
>>
>> http://www.animats.com/papers/languages/newthreadingintro.html
>
> The user guide says:
>
> The suggested import is
>
> from newthreading import *
>
> The import * form is considered bad practise in *general* and should not be
> recommended unless there is a good reason. This is slightly off-topic for
> python-dev, although I appreciate that you want feedback with the eventual
> goal of producing a PEP - however the introduction of free-threading in
> Python has not been hampered by lack of synchronization primitives but by
> the difficulty of changing the interpreter without unduly impacting single
> threaded code.
>

I asked John to drop a message here for this project - so if anyone
deserves a flame, it's me. This *is* relevant, and I'd guess fairly
interesting to the group as a whole.

jesse

From solipsis at pitrou.net  Sat Jun 26 16:34:12 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sat, 26 Jun 2010 16:34:12 +0200
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
 thread usage, and a path to getting rid of the GIL
References: <4C259A25.1060705@animats.com> <4C2600B4.5020503@voidspace.org.uk>
Message-ID: <20100626163412.25b68be6@pitrou.net>

On Sat, 26 Jun 2010 14:29:24 +0100
Michael Foord <fuzzyman at voidspace.org.uk> wrote:
> 
> the introduction of 
> free-threading in Python has not been hampered by lack of 
> synchronization primitives but by the difficulty of changing the 
> interpreter without unduly impacting single threaded code.

Exactly what I think too.

cheers

Antoine.


From jnoller at gmail.com  Sat Jun 26 16:44:15 2010
From: jnoller at gmail.com (Jesse Noller)
Date: Sat, 26 Jun 2010 10:44:15 -0400
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
	thread usage, and a path to getting rid of the GIL
In-Reply-To: <4C2600B4.5020503@voidspace.org.uk>
References: <4C259A25.1060705@animats.com> <4C2600B4.5020503@voidspace.org.uk>
Message-ID: <AANLkTikpDB4FsFCESF2Ub0ZhXJiZyERZF6zLAjUovcr8@mail.gmail.com>

On Sat, Jun 26, 2010 at 9:29 AM, Michael Foord
<fuzzyman at voidspace.org.uk> wrote:
> On 26/06/2010 07:11, John Nagle wrote:
>>
>> We have just released a proof-of-concept implementation of a new
>> approach to thread management - "newthreading". It is available
>> for download at
>>
>> https://sourceforge.net/projects/newthreading/
>>
>> The user's guide is at
>>
>> http://www.animats.com/papers/languages/newthreadingintro.html
>
> The user guide says:
>
> The suggested import is
>
> from newthreading import *
>
> The import * form is considered bad practise in *general* and should not be
> recommended unless there is a good reason. This is slightly off-topic for
> python-dev, although I appreciate that you want feedback with the eventual
> goal of producing a PEP - however the introduction of free-threading in
> Python has not been hampered by lack of synchronization primitives but by
> the difficulty of changing the interpreter without unduly impacting single
> threaded code.
>
> Providing an alternative garbage collection mechanism other than reference
> counting would be a more interesting first-step as far as I can see, as that
> removes the locking required around every access to an object (which
> currently touches the reference count). Introducing free-threading by
> *changing* the threading semantics (so you can't share non-frozen objects
> between threads) would not be acceptable. That comment is likely to be based
> on a misunderstanding of your future intentions though. :-)
>
> All the best,
>
> Michael Foord

I'd also like to point out that one of the projects John cites is Adam
Olsen's Safethread work:

http://code.google.com/p/python-safethread/

It is a good read in its own right.

From stephen at xemacs.org  Sat Jun 26 19:24:50 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 27 Jun 2010 02:24:50 +0900
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <4C25B319.8040804@canterbury.ac.nz>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<i039jr$h8$1@dough.gmane.org> <4C25B319.8040804@canterbury.ac.nz>
Message-ID: <87d3vdn1ul.fsf@uwakimon.sk.tsukuba.ac.jp>

Greg Ewing writes:

 > Would there be any sanity in having an option to compile
 > Python with UTF-8 as the internal string representation?

Losing Py_UNICODE as mentioned by Stefan Behnel (IIRC) is just the
beginning of the pain.

If Emacs's experience is any guide, the cost in speed and complexity
of a variable-width internal representation is high.  There are a
number of tricks you can use, but basically everything becomes O(n)
for the natural implementation of most operations (such as indexing by
character).  You can get around that with a position cache, of course,
but that adds complexity, and really cuts into the space saving (and
worse, adds another chunk that may or may not be paged in when you
need it).
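
A rough sketch of why indexing by character becomes O(n) over a
variable-width buffer, written in Python 3 syntax. The helper names
char_offset and char_at are invented for this example, and the code has no
position cache: every lookup rescans from the start of the buffer.

```python
def char_offset(buf, index):
    """Return the byte offset of character number `index` in UTF-8 `buf`."""
    seen = 0
    for offset, byte in enumerate(buf):
        # Bytes of the form 10xxxxxx continue a character; any other
        # byte starts a new one.
        if byte & 0xC0 != 0x80:
            if seen == index:
                return offset
            seen += 1
    raise IndexError(index)


def char_at(buf, index):
    """Decode the single character at position `index` (linear scan)."""
    start = char_offset(buf, index)
    end = start + 1
    while end < len(buf) and buf[end] & 0xC0 == 0x80:
        end += 1
    return buf[start:end].decode("utf-8")


text = "na\u00efve \u20ac5"
buf = text.encode("utf-8")
print(len(text), "characters in", len(buf), "bytes")
print(char_offset(buf, 6))   # byte offset of the euro sign
```

A position cache would remember some (character index, byte offset) pairs
so the scan can start nearby rather than at zero, which is where the extra
complexity and space come from.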

What we're considering is a system where buffers come in 1-, 2-, and
4-octet widechars, with automatic translation depending on content.
But the buffer is the primary random-access structure in Emacsen, so
optimizing it is probably worth our effort.  I doubt it would be worth
it for Python, but my intuitions here are not reliable.
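
The content-dependent width selection might look something like this
sketch. The function name narrowest_width and the exact thresholds are
assumptions for illustration (one octet up to U+00FF, two up to U+FFFF);
the message above does not spell out how Emacs would choose.

```python
def narrowest_width(text):
    """Return 1, 2 or 4: octets per character for a fixed-width buffer."""
    highest = max(map(ord, text), default=0)
    if highest < 0x100:
        return 1
    if highest < 0x10000:
        return 2
    return 4


for sample in ["plain ASCII", "caf\u00e9", "\u20ac 5", "clef \U0001d11e"]:
    print(repr(sample), "->", narrowest_width(sample), "octet(s) per char")
```

Indexing stays O(1) within a buffer because every character in it has the
same width; the cost moves to re-encoding the buffer when a wider character
is inserted.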

From nagle at animats.com  Sat Jun 26 18:39:19 2010
From: nagle at animats.com (John Nagle)
Date: Sat, 26 Jun 2010 09:39:19 -0700
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
 thread usage, and a path to getting rid of the GIL
In-Reply-To: <AANLkTikpDB4FsFCESF2Ub0ZhXJiZyERZF6zLAjUovcr8@mail.gmail.com>
References: <4C259A25.1060705@animats.com>	<4C2600B4.5020503@voidspace.org.uk>
	<AANLkTikpDB4FsFCESF2Ub0ZhXJiZyERZF6zLAjUovcr8@mail.gmail.com>
Message-ID: <4C262D37.7020807@animats.com>

On 6/26/2010 7:44 AM, Jesse Noller wrote:
> On Sat, Jun 26, 2010 at 9:29 AM, Michael Foord
> <fuzzyman at voidspace.org.uk>  wrote:
>> On 26/06/2010 07:11, John Nagle wrote:
>>>
>>> We have just released a proof-of-concept implementation of a new
>>> approach to thread management - "newthreading".
....

>> The import * form is considered bad practise in *general* and
>> should not be recommended unless there is a good reason.

    I agree.  I just did that to make the examples cleaner.

>> however the introduction of free-threading in Python has not been
>> hampered by lack of synchronization primitives but by the
>> difficulty of changing the interpreter without unduly impacting
>> single threaded code.

     That's what I'm trying to address here.

>> Providing an alternative garbage collection mechanism other than
>> reference counting would be a more interesting first-step as far as
>> I can see, as that removes the locking required around every access
>> to an object (which currently touches the reference count).
>> Introducing free-threading by *changing* the threading semantics
>> (so you can't share non-frozen objects between threads) would not
>> be acceptable. That comment is likely to be based on a
>> misunderstanding of your future intentions though. :-)

     This work comes out of a discussion a few of us had at a restaurant
in Palo Alto after a Stanford talk by the group at Facebook which
is building a JIT compiler for PHP.  We were discussing how to
make threading both safe for the average programmer and efficient.
Javascript and PHP don't have threads at all; Python has safe
threading, but it's slow.  C/C++/Java all have race condition
problems, of course.  The Facebook guy pointed out that you
can't redefine a function dynamically in PHP, and they get
a performance win in their JIT by exploiting this.

     I haven't gone into the memory model in enough detail in the
technical paper.  The memory model I envision for this has three
memory zones:

     1.  Shared fully-immutable objects: primarily strings, numbers,
and tuples, all of whose elements are fully immutable.  These can
be shared without locking, and reclaimed by a concurrent garbage
collector like Boehm's.  They have no destructors, so finalization
is not an issue.

     2.  Local objects.  These are managed as at present, and
require no locking.  These can either be thread-local, or local
to a synchronized object.  There are no links between local
objects under different "ownership".  Whether each thread and
object has its own private heap, or whether there's a common heap with
locks at the allocator is an implementation decision.

     3.  Shared mutable objects: mostly synchronized objects, but
also immutable objects like tuples which contain references
to objects that aren't fully immutable.  These are the high-overhead
objects, and require locking during reference count updates, or
atomic reference count operations if supported by the hardware.
The general idea is to minimize the number of objects in this
zone.

     The zone of an object is determined when the object is created,
and never changes.   This is relatively simple to implement.
Tuples (and frozensets, frozendicts, etc.) are normally zone 2
objects.  Only "freeze" creates collections in zones 1 and 3.
Synchronized objects are always created in zone 3.
There are no difficult handoffs, where an object that was previously
thread-local now has to be shared and has to acquire locks during
the transition.
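
     One way to picture the zone 1 versus zone 3 split for frozen
collections is the sketch below.  It is only an illustration of the rule
stated above; is_fully_immutable and zone_after_freeze are hypothetical
names, and the set of "atomic" types is an assumption, not part of
newthreading.

```python
# Assumed set of leaf types treated as fully immutable in this sketch.
ATOMIC_IMMUTABLE = (str, bytes, int, float, complex, bool, type(None))


def is_fully_immutable(obj):
    """True if obj and everything reachable from it is immutable."""
    if type(obj) in ATOMIC_IMMUTABLE:
        return True
    if type(obj) in (tuple, frozenset):
        return all(is_fully_immutable(item) for item in obj)
    return False


def zone_after_freeze(obj):
    """Hypothetical: zone 1 if fully immutable, otherwise zone 3."""
    return 1 if is_fully_immutable(obj) else 3


print(zone_after_freeze((1, "a", (2.0, None))))   # shareable without locks
print(zone_after_freeze((1, [2, 3])))             # tuple holding a list
```

The decision is made once, from the contents at creation time, which
matches the statement that an object's zone never changes.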

     Existing interlinked data structures, like parse trees and GUIs,
are by default zone 2 objects, with the same semantics as at
present.  They can be placed inside a SynchronizedObject if
desired, which makes them usable from multiple threads.
That's optional; they're thread-local otherwise.

     The rationale behind "freezing" some of the language semantics
when the program goes multi-thread comes from two sources -
Adam Olsen's Safethread work, and the acceptance of the
multiprocessing module.  Olsen tried to retain all the dynamism of
the language in a multithreaded environment, but locking all the
underlying dictionaries was a boat-anchor on the whole system,
and slowed things down so much that he abandoned the project.
The Unladen Swallow documentation indicates that early thinking
on the project was that Olsen's approach would allow getting
rid of the GIL, but later notes indicate that no path to a
GIL-free JIT system is currently in development.

     The multiprocessing module provides semantics similar to
threading with "freezing".  Data passed between processes is "frozen"
by pickling.  Processes can't modify each other's code.  Restrictive
though the multiprocessing module is, it appears to be useful.
It is sometimes recommended as the Pythonic approach to multi-core CPUs.
This is an indication that "freezing" is not unacceptable to the
user community.
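
     The "frozen by pickling" effect can be seen without starting any
processes.  In this sketch the dictionary contents are made up; the point
is only that the receiver of pickled data gets an independent snapshot, so
later changes on the sending side are not visible to it.

```python
import pickle

config = {"workers": 4, "paths": ["/tmp/a"]}

# What a receiving process would see: a copy rebuilt from the pickle.
snapshot = pickle.loads(pickle.dumps(config))

# Mutating the original afterwards does not affect the snapshot.
config["paths"].append("/tmp/b")

print(config["paths"])
print(snapshot["paths"])
```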

     Most of the real-world use cases for extreme dynamism
involve events that happen during startup.  Configuration files are
read, modules are selectively included, functions are overridden, tables
of references to functions are set up, regular expressions are compiled,
and the code is brought into the appropriately configured state.  Then
the worker threads are started and the real work starts. The
"newthreading" approach allows all that.

     After two decades of failed attempts to remove the Global
Interpreter Lock without making performance worse, it is perhaps
time to take a harder look at scalable threading semantics.

					John Nagle
					Animats

From pje at telecommunity.com  Sat Jun 26 20:17:44 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sat, 26 Jun 2010 14:17:44 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
Message-ID: <20100626181753.601473A4108@sparrow.telecommunity.com>

At 12:42 PM 6/26/2010 +0900, Stephen J. Turnbull wrote:
>What I'm saying here is that if bytes are the signal of validity, and
>the stdlib functions preserve validity, then it's better to have the
>stdlib functions object to unicode data as an argument.  Compare the
>alternative: it returns a unicode object which might get passed around
>for a while before one of your functions receives it and identifies it
>as unvalidated data.

I still don't follow, since passing in bytes should return 
bytes.  Returning unicode would be an error, in the case of a 
"polymorphic" function (per Guido).


>But you agree that there are better mechanisms for validation
>(although not available in Python yet), so I don't see this as a
>potential obstacle to polymorphism now.

Nope.  I'm just saying that, given two bytestrings to url-join or 
path join or whatever, a polymorph should hand back a 
bytestring.  This seems pretty uncontroversial.
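
A minimal sketch of that kind of polymorphic function, in Python 3 syntax. 
The name join_path is hypothetical and this is not a stdlib function; it 
just picks its separator literal to match the argument type and refuses 
mixed input.

```python
def join_path(base, leaf):
    """Join two path segments, returning the same type that was passed in."""
    if isinstance(base, bytes) != isinstance(leaf, bytes):
        raise TypeError("cannot mix bytes and str arguments")
    # Choose the literal to match the inputs so the result type follows them.
    sep = b"/" if isinstance(base, bytes) else "/"
    return base.rstrip(sep) + sep + leaf.lstrip(sep)


print(join_path("a/", "/b"))
print(join_path(b"a", b"b"))
```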


>  > What I want is for the stdlib to create stringlike objects of a
>  > type determined by the types of the inputs --
>
>In general this is a hard problem, though.  Polymorphism, OK, one-way
>tainting OK, but in general combining related types is pretty
>arbitrary, and as in the encoded-bytes case, the result type often
>varies depending on expectations of callers, not the types of the
>data.

But the caller can enforce those expectations by passing in arguments 
whose types do what they want in such cases, as long as the string 
literals used by the function don't get to override the relevant 
parts of the string protocol(s).

The idea that I'm proposing is that the basic string and byte types 
should defer to "user-defined" string types for mixed type 
operations, so that polymorphism of string-manipulation functions is 
the *default* case, rather than a *special* case.  This makes 
tainting easier to implement, as well as optimizing and other special 
cases (like my "source string w/file and line info", or a string with 
font/formatting attributes).
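
For illustration, here is how far a user-defined string type gets today 
with concatenation, and where it stops.  Tainted is a hypothetical class 
invented for this sketch; "+" already defers to it through __radd__, while 
str.join does not, which is the sort of gap the proposal is about.

```python
class Tainted(str):
    """Hypothetical user-defined string type marking untrusted data."""

    def __add__(self, other):
        return Tainted(str.__add__(self, other))

    def __radd__(self, other):
        # Called for plain_str + Tainted, since str itself defines no
        # numeric "+" slot and Python falls back to the right operand.
        return Tainted(str.__add__(other, self))


left = Tainted("name") + " FROM t"
right = "SELECT " + Tainted("name")
joined = ", ".join([Tainted("a"), Tainted("b")])

print(type(left).__name__, type(right).__name__, type(joined).__name__)
```

The first two results keep the Tainted type; the join result comes back as 
a plain str, so the taint is silently lost there.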




From doko at ubuntu.com  Sat Jun 26 22:06:30 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Sat, 26 Jun 2010 22:06:30 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>	<4C23D3C2.1060500@scottdial.com>	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>	<4C246E81.3020302@scottdial.com>
	<A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
Message-ID: <4C265DC6.4080600@ubuntu.com>

On 25.06.2010 22:12, James Y Knight wrote:
>
> On Jun 25, 2010, at 4:53 AM, Scott Dial wrote:
>
>> On 6/24/2010 8:23 PM, James Y Knight wrote:
>>> On Jun 24, 2010, at 5:53 PM, Scott Dial wrote:
>>>> If the package has .so files that aren't compatible with other versions
>>>> of python, then what is the motivation for placing that in a shared
>>>> location (since it can't actually be shared)
>>>
>>> Because python looks for .so files in the same place it looks for the
>>> .py files of the same package.
>>
>> My suggestion was that a package that contains .so files should not be
>> shared (e.g., the entire lxml package should be placed in a
>> version-specific path). The motivation for this PEP was to simplify the
>> installation of python packages for distros; it was not to reduce the
>> number of .py files on the disk.
>>
>> Placing .so files together does not simplify that install process in any
>> way. You will still have to handle such packages in a special way.
>
>
> This is a good point, but I think still falls short of a solution. For a
> package like lxml, indeed you are correct. Since debian needs to build
> it once per version, it could just put the entire package (.py files and
> .so files) into a different per-python-version directory.

This is what is currently done.  This will increase the size of packages by 
duplicating the .py files, or you have to install the .py in a common location 
(irrelevant to sys.path), and provide (sym)links to the expected location.

A "different per-python-version directory" also has the disadvantage that file 
conflicts between (distribution) packages cannot be detected.

> However, then you have to also consider python packages made up of
> multiple distro packages -- like twisted or zope. Twisted includes some
> C extensions in the core package. But then there are other twisted
> modules (installed under a "twisted.foo" name) which do not include C
> extensions. If the base twisted package is installed under a
> version-specific directory, then all of the submodule packages need to
> also be installed under the same version-specific directory (and thus
> built for all versions).
>
> In the past, it has proven somewhat tricky to coordinate which directory
> the modules for package "foo" should be installed in, because you need
> to know whether *any* of the related packages includes a native ".so"
> file, not just the current package.
>
> The converse situation, where a base package did *not* get installed
> into a version-specific directory because it includes no native code,
> but a submodule *does* include a ".so" file, is even trickier.

I don't think that installation into different locations based on the presence
of extensions will work.  Should a location really change if an extension is
added as an optimization?  Splitting a (python) package into different 
installation locations should be avoided.

   Matthias

From doko at ubuntu.com  Sat Jun 26 22:14:54 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Sat, 26 Jun 2010 22:14:54 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTilDuY2bWXY8Yc_FaICCknv-zHsQox1F6n3ybREt@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>	<4C246E81.3020302@scottdial.com>	<A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
	<AANLkTilDuY2bWXY8Yc_FaICCknv-zHsQox1F6n3ybREt@mail.gmail.com>
Message-ID: <4C265FBE.9070809@ubuntu.com>

On 26.06.2010 02:19, Nick Coghlan wrote:
> On Sat, Jun 26, 2010 at 6:12 AM, James Y Knight<foom at fuhm.net>  wrote:
>> However, then you have to also consider python packages made up of multiple
>> distro packages -- like twisted or zope. Twisted includes some C extensions
>> in the core package. But then there are other twisted modules (installed
>> under a "twisted.foo" name) which do not include C extensions. If the base
>> twisted package is installed under a version-specific directory, then all of
>> the submodule packages need to also be installed under the same
>> version-specific directory (and thus built for all versions).
>>
>> In the past, it has proven somewhat tricky to coordinate which directory the
>> modules for package "foo" should be installed in, because you need to know
>> whether *any* of the related packages includes a native ".so" file, not just
>> the current package.
>>
>> The converse situation, where a base package did *not* get installed into a
>> version-specific directory because it includes no native code, but a
>> submodule *does* include a ".so" file, is even trickier.
>
> I think there are two major ways to tackle this:
> - allow multiple versions of a .so file within a single directory (i.e
> Barry's current suggestion)

we already do this, see the naming of the extensions of a python debug build on 
Windows.  Several distributions (Debian, Fedora, Ubuntu) do use this as well to 
provide extensions for python debug builds.

> - enhanced namespace packages, allowing a single package to be spread
> across multiple directories, some of which may be Python version
> specific (i.e. modifications to PEP 382 to support references to
> version-specific directories)

this is not what I want to use in a distribution.  package management systems 
like rpm and dpkg do handle conflicts and replacements of files pretty well, 
having the same file in potentially different locations in the file system 
doesn't help detecting conflicts and duplicate packages.

   Matthias

From doko at ubuntu.com  Sat Jun 26 22:22:29 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Sat, 26 Jun 2010 22:22:29 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100624164637.22fd9160@heresy>
References: <20100624115048.4fd152e3@heresy>	<AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>	<20100624135119.00b9ac5c@heresy>	<AANLkTindH5uADbSwan-xWV08YcDaEKI3CleaFjhdmHvX@mail.gmail.com>	<20100624142830.4c859faf@limelight.wooz.org>
	<20100624164637.22fd9160@heresy>
Message-ID: <4C266185.7080509@ubuntu.com>

On 24.06.2010 22:46, Barry Warsaw wrote:
> On Jun 24, 2010, at 02:28 PM, Barry Warsaw wrote:
>
>> On Jun 24, 2010, at 01:00 PM, Benjamin Peterson wrote:
>>
>>> 2010/6/24 Barry Warsaw<barry at python.org>:
>>>> On Jun 24, 2010, at 10:58 AM, Benjamin Peterson wrote:
>>>>
>>>>> 2010/6/24 Barry Warsaw<barry at python.org>:
>>>>>> Please let me know what you think.  I'm happy to just commit this to the
>>>>>> py3k branch if there are no objections<wink>.  I don't think a new PEP is
>>>>>> in order, but an update to PEP 3147 might make sense.
>>>>>
>>>>> How will this interact with PEP 384 if that is implemented?
>>>> I'm trying to come up with something that will work immediately while PEP 384
>>>> is being adopted.
>>>
>>> But how will modules specify that they support multiple ABIs then?
>>
>> I didn't understand, so asked Benjamin for clarification in IRC.
>>
>> <gutworth>  barry: if python 3.3 will only load x.3.3.so, but x.3.2.so supports
>>            the stable abi, will it load it?  [14:25]
>> <barry>  gutworth: thanks, now i get it :)  [14:26]
>> <barry>  gutworth: i think it should, but it wouldn't under my scheme.  let me
>>         think about it
>
> So, we could say that PEP 384 compliant extension modules would get written
> without a version specifier.  IOW, we'd treat foo.so as using the ABI.  It
> would then be up to the Python runtime to throw ImportErrors if in fact we
> were loading a legacy, non-PEP 384 compliant extension.

Is it realistic to never break the ABI?  I would think of having the ABI encoded 
in the file name as well, and only bump the ABI if it does change.  With the 
"versioned .so files" proposal an ABI bump is necessary with every python 
version, with PEP 384 the ABI bump will be decoupled from the python version.
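
A minimal sketch of the lookup this would imply, assuming a per-version tag
plus a separately numbered ABI tag.  The tag spellings ("cpython-33", "abi1")
and both function names are hypothetical and only illustrate the idea; they
are not taken from any patch in this thread.

```python
def candidate_names(module, version_tag, abi_tags=()):
    """Return extension file names to try, most specific first."""
    names = ["%s.%s.so" % (module, version_tag)]
    # A stable-ABI build is shared by every interpreter that supports that
    # ABI, so its tag changes only when the ABI itself changes.
    names.extend("%s.%s.so" % (module, tag) for tag in abi_tags)
    names.append("%s.so" % module)  # legacy, untagged name
    return names


def pick_extension(module, version_tag, abi_tags, available):
    """Return the first candidate present in `available`, or None."""
    for name in candidate_names(module, version_tag, abi_tags):
        if name in available:
            return name
    return None
```

With this ordering, an interpreter tagged cpython-33 would skip
foo.cpython-32.so but could still pick up foo.abi1.so, which is the case
Benjamin asked about.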

   Matthias

From doko at ubuntu.com  Sat Jun 26 22:25:28 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Sat, 26 Jun 2010 22:25:28 +0200
Subject: [Python-Dev] FHS compliance of Python installation
In-Reply-To: <876318lynt.fsf_-_@benfinney.id.au>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<876318lynt.fsf_-_@benfinney.id.au>
Message-ID: <4C266238.2020107@ubuntu.com>

On 25.06.2010 02:54, Ben Finney wrote:
> James Y Knight<foom at fuhm.net>  writes:
>
>> Really, python should store the .py files in /usr/share/python/, the
>> .so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
>> files in /var/lib/python2.5-debug. But python doesn't work like that.
>
> +1
>
> So who's going to draft the "Filesystem Hierarchy Standard compliance"
> PEP? :-)

This has nothing to do with the FHS.  The FHS talks about data, not code.

From ctb at msu.edu  Sat Jun 26 22:30:27 2010
From: ctb at msu.edu (C. Titus Brown)
Date: Sat, 26 Jun 2010 13:30:27 -0700
Subject: [Python-Dev] FHS compliance of Python installation
In-Reply-To: <4C266238.2020107@ubuntu.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<876318lynt.fsf_-_@benfinney.id.au> <4C266238.2020107@ubuntu.com>
Message-ID: <20100626203024.GA19754@idyll.org>

On Sat, Jun 26, 2010 at 10:25:28PM +0200, Matthias Klose wrote:
> On 25.06.2010 02:54, Ben Finney wrote:
>> James Y Knight<foom at fuhm.net>  writes:
>>
>>> Really, python should store the .py files in /usr/share/python/, the
>>> .so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
>>> files in /var/lib/python2.5-debug. But python doesn't work like that.
>>
>> +1
>>
>> So who's going to draft the "Filesystem Hierarchy Standard compliance"
>> PEP? :-)
>
> This has nothing to do with the FHS.  The FHS talks about data, not code.

Really?  It has some guidelines here for object files, etc., at least as
of 2004.

http://www.pathname.com/fhs/pub/fhs-2.3.html

A quick scan suggests /usr/lib is the right place to look:

http://www.pathname.com/fhs/pub/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA

cheers,
--titus
-- 
C. Titus Brown, ctb at msu.edu

From doko at ubuntu.com  Sat Jun 26 22:35:40 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Sat, 26 Jun 2010 22:35:40 +0200
Subject: [Python-Dev] FHS compliance of Python installation
In-Reply-To: <20100626203024.GA19754@idyll.org>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<876318lynt.fsf_-_@benfinney.id.au> <4C266238.2020107@ubuntu.com>
	<20100626203024.GA19754@idyll.org>
Message-ID: <4C26649C.1000507@ubuntu.com>

On 26.06.2010 22:30, C. Titus Brown wrote:
> On Sat, Jun 26, 2010 at 10:25:28PM +0200, Matthias Klose wrote:
>> On 25.06.2010 02:54, Ben Finney wrote:
>>> James Y Knight<foom at fuhm.net>   writes:
>>>
>>>> Really, python should store the .py files in /usr/share/python/, the
>>>> .so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
>>>> files in /var/lib/python2.5-debug. But python doesn't work like that.
>>>
>>> +1
>>>
>>> So who's going to draft the "Filesystem Hierarchy Standard compliance"
>>> PEP? :-)
>>
>> This has nothing to do with the FHS.  The FHS talks about data, not code.
>
> Really?  It has some guidelines here for object files, etc., at least as
> of 2004.
>
> http://www.pathname.com/fhs/pub/fhs-2.3.html
>
> A quick scan suggests /usr/lib is the right place to look:
>
> http://www.pathname.com/fhs/pub/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA

agreed for object files, but
http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA
explicitly states "The /usr/share hierarchy is for all read-only architecture 
independent *data* files".

From doko at ubuntu.com  Sat Jun 26 22:45:54 2010
From: doko at ubuntu.com (Matthias Klose)
Date: Sat, 26 Jun 2010 22:45:54 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>	<4C246E81.3020302@scottdial.com>
	<AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
Message-ID: <4C266702.4010102@ubuntu.com>

On 25.06.2010 20:58, Brett Cannon wrote:
> On Fri, Jun 25, 2010 at 01:53, Scott Dial
>> Placing .so files together does not simplify that install process in any
>> way. You will still have to handle such packages in a special way. You
>> must still compile the package multiple times for each relevant version
>> of python (with special tagging that I imagine distutils can take care
>> of) and, worse yet, you have created a trickier install than merely
>> having multiple search paths (e.g., installing/uninstalling lxml for
>> *one* version of python is actually more difficult in this scheme).
>
> This is meant to be used by distros in a programmatic fashion, so my
> response is "so what?" Their package management system is going to
> maintain the directory, not a person. You and I are not going to be
> using this for anything. This is purely meant for Linux OS vendors
> (maybe OS X) to manage their installs through their package software.
> I honestly do not expect human beings to be mucking around with these
> installs (and I suspect Barry doesn't either).

Placing files for a distribution in a version-independent path does help 
distributions handle file conflicts, detect duplicates, and move 
files between different (distribution) packages.

Having non-conflicting extension names is a scheme that is already used on some 
platforms (debug builds on Windows).  The question for me is whether just renaming 
the .so files is acceptable for upstream, or whether distributors should implement 
this on their own, as something like:

   if ext_path.startswith('/usr/') and not ext_path.startswith('/usr/local/'):
     load_ext('foo.2.6.so')
   else:
     load_ext('foo.so')

I fear this will cause issues when e.g. virtualenv environments start copying 
parts of the system installation instead of symlinking them.

   Matthias

From bugtrack at roumenpetrov.info  Sat Jun 26 22:40:07 2010
From: bugtrack at roumenpetrov.info (Roumen Petrov)
Date: Sat, 26 Jun 2010 23:40:07 +0300
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
Message-ID: <4C2665A7.6080601@roumenpetrov.info>

Brett Cannon wrote:
> I finally realized why clang has not been silencing its warnings about
> unused return values: I have -Wno-unused-value set in CFLAGS which
> comes before OPT (which defines -Wall) as set in PY_CFLAGS in
> Makefile.pre.in.
>
> I could obviously set OPT in my environment, but that would override
> the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
> but the README says that's for stuff that tweak binary compatibility.
>
> So basically what I am asking is what environment variable should I
> use? If CFLAGS is correct then does anyone have any issues if I change
> the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
> after OPT?

It is not important to me, as flags set in BASECFLAGS, CFLAGS, OPT or 
EXTRA_CFLAGS all end up in the makefile macro CFLAGS, and after distribution 
Python's distutils will use them to build extension modules. So all of these 
variables are equivalent for builds.

Also, after running configure without OPT set, we could check what the script 
selected for the build platform and then rerun configure with OPT plus our own 
flags set on the command line (! ;) ).

Roumen

From foom at fuhm.net  Sat Jun 26 23:10:42 2010
From: foom at fuhm.net (James Y Knight)
Date: Sat, 26 Jun 2010 17:10:42 -0400
Subject: [Python-Dev] FHS compliance of Python installation
In-Reply-To: <4C26649C.1000507@ubuntu.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<876318lynt.fsf_-_@benfinney.id.au> <4C266238.2020107@ubuntu.com>
	<20100626203024.GA19754@idyll.org> <4C26649C.1000507@ubuntu.com>
Message-ID: <FE2C9460-5111-41A5-8405-C79D888DFB26@fuhm.net>


On Jun 26, 2010, at 4:35 PM, Matthias Klose wrote:

> On 26.06.2010 22:30, C. Titus Brown wrote:
>> On Sat, Jun 26, 2010 at 10:25:28PM +0200, Matthias Klose wrote:
>>> On 25.06.2010 02:54, Ben Finney wrote:
>>>> James Y Knight<foom at fuhm.net>   writes:
>>>>
>>>>> Really, python should store the .py files in /usr/share/python/, the
>>>>> .so files in /usr/lib/x86_64-linux-gnu/python2.5-debug/, and the .pyc
>>>>> files in /var/lib/python2.5-debug. But python doesn't work like that.
>>>>
>>>> +1
>>>>
>>>> So who's going to draft the "Filesystem Hierarchy Standard compliance"
>>>> PEP? :-)
>>>
>>> This has nothing to do with the FHS.  The FHS talks about data,  
>>> not code.
>>
>> Really?  It has some guidelines here for object files, etc., at  
>> least as
>> of 2004.
>>
>> http://www.pathname.com/fhs/pub/fhs-2.3.html
>>
>> A quick scan suggests /usr/lib is the right place to look:
>>
>> http://www.pathname.com/fhs/pub/fhs-2.3.html#USRLIBLIBRARIESFORPROGRAMMINGANDPA
>
> agreed for object files, but
> http://www.pathname.com/fhs/pub/fhs-2.3.html#USRSHAREARCHITECTUREINDEPENDENTDATA
> explicitly states "The /usr/share hierarchy is for all read-only  
> architecture independent *data* files".

I always figured the "read-only architecture independent" bit was the  
important part there, and "code is data". Emacs's el files go into
/usr/share/emacs, for instance.

James

From tjreedy at udel.edu  Sun Jun 27 00:11:03 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sat, 26 Jun 2010 18:11:03 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <51EFE211-DBCA-497E-9BC5-CC0D2256173E@twistedmatrix.com>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<51EFE211-DBCA-497E-9BC5-CC0D2256173E@twistedmatrix.com>
Message-ID: <i05tto$ndo$1@dough.gmane.org>

The several posts in this and other threads got me thinking about text 
versus number computing (which I am more familiar with).

For numbers, we have in Python three builtins, the general purpose ints 
and floats and the more specialized complex. Two other rational types 
can be imported for specialized uses. And then there are 3rd-party 
libraries like mpz and numpy with more number and array of number types.

What makes these all potentially work together is the special method 
system, including, in particular, the rather complete set of __rxxx__ 
number methods. The latter allow non-commutative operations to be mixed 
either way and ease mixed commutative operations.

For text, we have general purpose str and encoded bytes (and bytearray). 
I think these are sufficient for general use and I am not sure there 
should even be anything else in the stdlib. But I think it should be 
possible to experiment with and use specialized 3rd-party text classes 
just as one can with number classes.

I can imagine that inter-operation, when appropriate, might work better 
with addition of a couple of  missing __rxxx__ methods, such as the 
mentioned __rcontains__. Although adding such would affect the 
implementation of a core syntax feature, it would not affect syntax as 
such as seen by the user.
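
A minimal sketch of the asymmetry described above.  The Metres and Tag
classes are made up for illustration.  The reflected __rsub__ lets a
user-defined number appear on the right of a builtin one, whereas the 'in'
operator currently consults only the container on the right.

```python
class Metres:
    def __init__(self, value):
        self.value = value

    def __sub__(self, other):
        # Metres - number
        if isinstance(other, (int, float)):
            return Metres(self.value - other)
        return NotImplemented

    def __rsub__(self, other):
        # number - Metres: int.__sub__ returns NotImplemented first,
        # then Python tries this reflected method.
        if isinstance(other, (int, float)):
            return Metres(other - self.value)
        return NotImplemented


class Tag:
    """A text-like object that is not a str subclass."""

    def __init__(self, text):
        self.text = text


def contains_fails():
    # 'in' calls only the container's __contains__; the left operand
    # gets no chance to respond, so str rejects the foreign type.
    try:
        Tag("b") in "abc"
    except TypeError:
        return True
    return False
```

A hypothetical __rcontains__ on Tag would be the hook that gives the left
operand its turn, in the same way __rsub__ does for subtraction.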

-- 
Terry Jan Reedy


From brett at python.org  Sun Jun 27 00:30:43 2010
From: brett at python.org (Brett Cannon)
Date: Sat, 26 Jun 2010 15:30:43 -0700
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
Message-ID: <AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>

On Wed, Jun 23, 2010 at 14:53, Brett Cannon <brett at python.org> wrote:
> I finally realized why clang has not been silencing its warnings about
> unused return values: I have -Wno-unused-value set in CFLAGS which
> comes before OPT (which defines -Wall) as set in PY_CFLAGS in
> Makefile.pre.in.
>
> I could obviously set OPT in my environment, but that would override
> the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
> but the README says that's for stuff that tweak binary compatibility.
>
> So basically what I am asking is what environment variable should I
> use? If CFLAGS is correct then does anyone have any issues if I change
> the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
> after OPT?
>

Since no one objected I swapped the order in r82259. In case anyone
else uses clang to compile Python, this means that -Wno-unused-value
will now work to silence the warning about unused return values that
is caused by some macros. Probably using -Wno-empty-body is also good
to avoid all the warnings triggered by the UCS4 macros in cjkcodecs.

From scott+python-dev at scottdial.com  Sun Jun 27 00:50:27 2010
From: scott+python-dev at scottdial.com (Scott Dial)
Date: Sat, 26 Jun 2010 18:50:27 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C265DC6.4080600@ubuntu.com>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>	<4C23D3C2.1060500@scottdial.com>	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>	<4C246E81.3020302@scottdial.com>	<A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
	<4C265DC6.4080600@ubuntu.com>
Message-ID: <4C268433.30405@scottdial.com>

On 6/26/2010 4:06 PM, Matthias Klose wrote:
> On 25.06.2010 22:12, James Y Knight wrote:
>> On Jun 25, 2010, at 4:53 AM, Scott Dial wrote:
>>> Placing .so files together does not simplify that install process in any
>>> way. You will still have to handle such packages in a special way.
>>
>> This is a good point, but I think still falls short of a solution. For a
>> package like lxml, indeed you are correct. Since debian needs to build
>> it once per version, it could just put the entire package (.py files and
>> .so files) into a different per-python-version directory.
> 
> This is what is currently done.  This will increase the size of packages
> by duplicating the .py files, or you have to install the .py in a common
> location (irrelevant to sys.path), and provide (sym)links to the
> expected location.

"This is what is currently done"  and "provide (sym)links to the
expected location" are conflicting statements. If you are symlinking .py
files from a shared location, then that is not the same as "just install
the package into a version-specific location". What motivation is there
for preferring symlinks?

Who cares if a distro package install yields duplicate .py files? Nor am
I motivated by having to carry duplicate .py files in a distribution
package (I imagine the compression of duplicate .py files is amazing).

> A "different per-python-version directory" also has the disadvantage
> that file conflicts between (distribution) packages cannot be detected.

Why? That sounds like a broken tool, maybe I am naive, please explain.
If two packages install /usr/lib/python2.6/foo.so that should be just as
detectable as two installing /usr/lib/python-shared/foo.cpython-26.so
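
To make the point concrete, here is a rough sketch of path-based conflict
detection.  It assumes the packaging tool has a manifest of installed paths
per package; the function name and the package names are hypothetical, and
this is not how any particular package manager is implemented.

```python
from collections import defaultdict


def find_conflicts(manifests):
    """Map each path claimed by more than one package to its owners.

    `manifests` maps a package name to the iterable of paths it installs.
    """
    owners = defaultdict(set)
    for package, paths in manifests.items():
        for path in paths:
            owners[path].add(package)
    return {path: sorted(pkgs) for path, pkgs in owners.items() if len(pkgs) > 1}
```

The check compares full paths, so it behaves the same whether the clash is
in a per-version directory or in a shared one.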

If you *must* compile .so files for every supported version of python at
packaging time, then you are already saying the set of python versions
is known. I fail to see the difference between a package that installs
.py and .so files into many directories than having many .so files in a
single directory; except that many directories *already* works. The only
gain I can see is that you save duplicate .py files in the package and
on the filesystem, and I don't feel that gain alone warrants this
fundamental change.

I would appreciate a proper explanation of why/how a single directory is
better for your distribution. Also, I haven't heard anyone that wasn't
using debian tools chime in with support for any of this, so I would
like to know how this can help RPMs and ebuilds and the like.

> I don't think that installation into different locations based on the
> presence of extension will work.  Should a location really change if an
> extension is added as an optimization?  Splitting a (python) package
> into different installation locations should be avoided.

I'm not sure why changing paths would matter; any package that writes
data in its install location would be considered broken by your distro
already, so what harm is there in having the packaging tool move it
later? Your tool will remove the old path and place it in a new path.

All of these shenanigans seem to manifest from your distro's
python-support/-central design, which seems to be entirely motivated by
reducing duplicate files and *not* simplifying the packaging. While this
plan works rather well with .py files, the devil is in the details. I
don't think Python should be getting involved in what I believe is a
flawed design.

What happens to the distro packaging if a python package splits the
codebase between 2.x and 3.x (meaning they have distinct .py files)? As
someone else mentioned, how is virtualenv going to interact with
packages that install like this?

-- 
Scott Dial
scott at scottdial.com
scodial at cs.indiana.edu

From mal at egenix.com  Sun Jun 27 01:37:02 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 27 Jun 2010 01:37:02 +0200
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
Message-ID: <4C268F1E.5070506@egenix.com>

Brett Cannon wrote:
> On Wed, Jun 23, 2010 at 14:53, Brett Cannon <brett at python.org> wrote:
>> I finally realized why clang has not been silencing its warnings about
>> unused return values: I have -Wno-unused-value set in CFLAGS which
>> comes before OPT (which defines -Wall) as set in PY_CFLAGS in
>> Makefile.pre.in.
>>
>> I could obviously set OPT in my environment, but that would override
>> the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
>> but the README says that's for stuff that tweak binary compatibility.
>>
>> So basically what I am asking is what environment variable should I
>> use? If CFLAGS is correct then does anyone have any issues if I change
>> the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
>> after OPT?
>>
> 
> Since no one objected I swapped the order in r82259. In case anyone
> else uses clang to compile Python, this means that -Wno-unused-value
> will now work to silence the warning about unused return values that
> is caused by some macros. Probably using -Wno-empty-body is also good
> to avoid all the warnings triggered by the UCS4 macros in cjkcodecs.

I think you need to come up with a different solution and revert
the change...

OPT has historically been the only variable to use for
adjusting the Python C compiler settings.

As the name implies this was usually used to adjust the
optimizer settings, including raising the optimization level
from the default or disabling it.

With your change CFLAGS will always override OPT and thus
any optimization definitions made in OPT will no longer
have an effect.

Note that CFLAGS defines -O2 on many platforms.

In your particular case, you should try setting OPT to
"... -Wno-unused-value ..." (ie. replace -Wall with your
setting).
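
A small sketch of why the ordering matters, assuming the usual GCC rule
that the last -O option on the command line is the one that takes effect.
The py_cflags helper is a simplification of PY_CFLAGS, and the flag values
are examples only.

```python
import re


def effective_opt_level(flags):
    """Return the last -O<level> flag in a compiler command line, or None."""
    level = None
    for flag in flags.split():
        if re.match(r"-O([0-3sg]|fast)?$", flag):
            level = flag  # a later flag replaces an earlier one
    return level


def py_cflags(basecflags, opt, cflags, cflags_last):
    """Join the pieces in either order (a simplification of PY_CFLAGS)."""
    if cflags_last:
        parts = [basecflags, opt, cflags]
    else:
        parts = [basecflags, cflags, opt]
    return " ".join(p for p in parts if p)
```

With OPT="-O0 -g -Wall" and CFLAGS="-O2", the order with OPT last leaves
-O0 in effect, while the order with CFLAGS last silently switches the build
to -O2.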

-- 
Marc-Andre Lemburg
eGenix.com

Professional Python Services directly from the Source  (#1, Jun 27 2010)
>>> Python/Zope Consulting and Support ...        http://www.egenix.com/
>>> mxODBC.Zope.Database.Adapter ...             http://zope.egenix.com/
>>> mxODBC, mxDateTime, mxTextTools ...        http://python.egenix.com/
________________________________________________________________________
2010-07-19: EuroPython 2010, Birmingham, UK                21 days to go

::: Try our new mxODBC.Connect Python Database Interface for free ! ::::


   eGenix.com Software, Skills and Services GmbH  Pastor-Loeh-Str.48
    D-40764 Langenfeld, Germany. CEO Dipl.-Math. Marc-Andre Lemburg
           Registered at Amtsgericht Duesseldorf: HRB 46611
               http://www.egenix.com/company/contact/

From brett at python.org  Sun Jun 27 02:13:20 2010
From: brett at python.org (Brett Cannon)
Date: Sat, 26 Jun 2010 17:13:20 -0700
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <4C268F1E.5070506@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com> 
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com> 
	<4C268F1E.5070506@egenix.com>
Message-ID: <AANLkTilSHp1JJinM-dQBRXqIPXkajMvvBkiglJTJqovZ@mail.gmail.com>

On Sat, Jun 26, 2010 at 16:37, M.-A. Lemburg <mal at egenix.com> wrote:
> Brett Cannon wrote:
>> On Wed, Jun 23, 2010 at 14:53, Brett Cannon <brett at python.org> wrote:
>>> I finally realized why clang has not been silencing its warnings about
>>> unused return values: I have -Wno-unused-value set in CFLAGS which
>>> comes before OPT (which defines -Wall) as set in PY_CFLAGS in
>>> Makefile.pre.in.
>>>
>>> I could obviously set OPT in my environment, but that would override
>>> the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
>>> but the README says that's for stuff that tweak binary compatibility.
>>>
>>> So basically what I am asking is what environment variable should I
>>> use? If CFLAGS is correct then does anyone have any issues if I change
>>> the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
>>> after OPT?
>>>
>>
>> Since no one objected I swapped the order in r82259. In case anyone
>> else uses clang to compile Python, this means that -Wno-unused-value
>> will now work to silence the warning about unused return values that
>> is caused by some macros. Probably using -Wno-empty-body is also good
>> to avoid all the warnings triggered by the UCS4 macros in cjkcodecs.
>
> I think you need to come up with a different solution and revert
> the change...
>
> OPT has historically been the only variable to use for
> adjusting the Python C compiler settings.

Just found the relevant section in the README.

>
> As the name implies this was usually used to adjust the
> optimizer settings, including raising the optimization level
> from the default or disabling it.

It meant optional to me, not optimization. I hate abbreviations sometimes.

>
> With your change CFLAGS will always override OPT and thus
> any optimization definitions made in OPT will no longer
> have an effect.

That was the point; OPT defines defaults through configure.in and I
simply wanted to add to those instead of having OPT completely
overwritten by me.

>
> Note that CFLAGS defines -O2 on many platforms.

So then wouldn't that mean they want that to be the optimization
level? Or is the historical reason for that default simply that some
default should exist, with the application expected to override it as desired?

>
> In your particular case, you should try setting OPT to
> "... -Wno-unused-value ..." (ie. replace -Wall with your
> setting).

So what is CFLAGS for then? ``configure -h`` says it's for "C compiler
flags"; that's extremely ambiguous. And it doesn't help that OPT is
not mentioned by ``configure -h`` as that is what I have always gone
by to know what flags are available for compilation.

-Brett


From ncoghlan at gmail.com  Sun Jun 27 04:43:23 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Jun 2010 12:43:23 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100626181753.601473A4108@sparrow.telecommunity.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
Message-ID: <AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>

On Sun, Jun 27, 2010 at 4:17 AM, P.J. Eby <pje at telecommunity.com> wrote:
> The idea that I'm proposing is that the basic string and byte types should
> defer to "user-defined" string types for mixed type operations, so that
> polymorphism of string-manipulation functions is the *default* case, rather
> than a *special* case.  This makes tainting easier to implement, as well as
> optimizing and other special cases (like my "source string w/file and line
> info", or a string with font/formatting attributes).

Rather than building this into the base string type, perhaps it would
be better (at least initially) to add in a polymorphic str subtype
that worked along the following lines:

1. Has an encoded argument in the constructor (e.g. poly_str("/", encoded=b"/"))
2. If given objects with an encode() method, assumes they're strings
and uses its own parent class methods
3. If given objects with a decode() method, assumes they're encoded
and delegates to the encoded attribute

str/bytes agnostic functions would need to invoke poly_str
deliberately, while bytes-only and text-only algorithms could just use
the appropriate literals.

Third party types would be supported to some degree (by having either
encode or decode methods), although they could still run into trouble
with some operations (While full support for third party strings and
byte sequence implementations is an interesting idea, I think it's
overkill for the specific problem of making it easier to write
str/bytes agnostic functions for tasks like URL parsing).
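
The three rules above can be sketched roughly as follows.  This is only an
illustration covering join() and "+"; the UTF-8 default and the
_is_encoded helper are assumptions, and every other str method would need
the same treatment.

```python
class poly_str(str):
    """Sketch of a str that also carries an encoded (bytes) twin."""

    def __new__(cls, text, encoded=None):
        self = super().__new__(cls, text)
        # Assumption: default to UTF-8 when no encoded form is supplied.
        self.encoded = text.encode("utf-8") if encoded is None else encoded
        return self

    @staticmethod
    def _is_encoded(other):
        # Rule 3: objects with decode() are treated as encoded data.
        return hasattr(other, "decode")

    def join(self, iterable):
        items = list(iterable)
        if items and all(self._is_encoded(item) for item in items):
            return self.encoded.join(items)
        # Rule 2: otherwise fall back to the parent str behaviour.
        return str.join(self, items)

    def __add__(self, other):
        if self._is_encoded(other):
            return self.encoded + other
        return str.__add__(self, other)

    def __radd__(self, other):
        if self._is_encoded(other):
            return other + self.encoded
        return NotImplemented
```

A str/bytes agnostic function would then write sep = poly_str("/",
encoded=b"/") once and use sep.join(parts) for either kind of input.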

Regards,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From ncoghlan at gmail.com  Sun Jun 27 04:59:07 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Jun 2010 12:59:07 +1000
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <i05tto$ndo$1@dough.gmane.org>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<51EFE211-DBCA-497E-9BC5-CC0D2256173E@twistedmatrix.com>
	<i05tto$ndo$1@dough.gmane.org>
Message-ID: <AANLkTinOQlQuGJX0Loem_WP3INQhbkVkcpg1F_a0H9uu@mail.gmail.com>

On Sun, Jun 27, 2010 at 8:11 AM, Terry Reedy <tjreedy at udel.edu> wrote:
> I can imagine that inter-operation, when appropriate, might work better with
> addition of a couple of missing __rxxx__ methods, such as the mentioned
> __rcontains__. Although adding such would affect the implementation of a
> core syntax feature, it would not affect syntax as such as seen by the user.

The problem with strings isn't really the binary operations like
__contains__ - adding __rcontains__ would be a fairly simple
extrapolation of the existing approaches.

Where it gets really messy for strings is the fact that whereas
invoking named methods directly on numbers is rare, invoking them on
strings is very common, and some of those methods (e.g. split(),
join(), __mod__()) allow or require an iterable rather than a single
object. This extends the range of use cases to be covered beyond those
with syntactic support to potentially include all string methods that
take arguments. Creating minimally surprising semantics for the
methods which accept iterables is also rather challenging.

It's an interesting idea, but I think it's overkill for the specific
problem of making it easier to perform more text-like manipulations in
a bytes-only domain.
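
A short sketch of the difficulty with named methods.  Tainted is a made-up
str subclass; the point is that the builtin methods return plain str, so
the subclass is lost as soon as a method or an iterable-taking call is
involved.

```python
class Tainted(str):
    """Made-up str subclass marking untrusted text."""


def result_types():
    user = Tainted("a b")
    return {
        # Each operation below builds a new plain str object.
        "concat": type(user + "!").__name__,
        "split": type(user.split()[0]).__name__,
        "join": type(", ".join([user, "c"])).__name__,
    }
```

Every entry comes back as "str" rather than "Tainted", and in the join()
case the custom object is only one element of the argument, which is where
defining unsurprising semantics gets hard.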

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From pje at telecommunity.com  Sun Jun 27 05:49:11 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sat, 26 Jun 2010 23:49:11 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.c
 om>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
Message-ID: <20100627034922.31A663A4108@sparrow.telecommunity.com>

At 12:43 PM 6/27/2010 +1000, Nick Coghlan wrote:
>While full support for third party strings and
>byte sequence implementations is an interesting idea, I think it's
>overkill for the specific problem of making it easier to write
>str/bytes agnostic functions for tasks like URL parsing.

OTOH, to write your partial implementation is almost as complex - it 
still must take into account joining and formatting, and so by that 
point, you've just proposed a new protocol for coercion...  so why 
not just make the coercion protocol explicit in the first place, 
rather than hardwiring a third type's worth of special cases?

Remember, bytes and strings already have to detect mixed-type 
operations.  If there was an API for that, then the hardcoded special 
cases would just be replaced, or supplemented with type slot checks 
and calls after the special cases.

To put it another way, if you already have two types special-casing 
their interactions with each other, then rather than add a *third* 
type to that mix, maybe it's time to have a protocol instead, so that 
the types that care can do the special-casing themselves, and you 
generalize to N user types.

(Btw, those who are saying that the resulting potential for N*N 
interaction makes the feature unworkable seem to be overlooking 
metaclasses and custom numeric types -- two Python features that in 
principle have the exact same problem, when you use them beyond a 
certain scope.  At least with those features, though, you can 
generally mix your user-defined metaclasses or numeric types with the 
Python-supplied basic ones and call arbitrary Python functions on 
them, without as much heartbreak as you'll get with a from-scratch 
stringlike object.)

All that having been said, a new protocol probably falls under the 
heading of the language moratorium, unless it can be considered "new 
methods on builtins"?  (But that seems like a stretch even to me.)

I just hate the idea that functions taking strings should have to be 
*rewritten* to be explicitly type-agnostic.  It seems *so* 
un-Pythonic...  like if all the bitmasking functions you'd ever 
written using 32-bit int constants had to be rewritten just because 
we added longs to the language, and you had to upcast them to be 
compatible or something.  Sounds too much like C or Java or some 
other non-Python language, where dynamism and polymorphy are the 
special case, instead of the general rule. 


From jyasskin at gmail.com  Sun Jun 27 07:46:24 2010
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Sat, 26 Jun 2010 22:46:24 -0700
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <4C268F1E.5070506@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com> 
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com> 
	<4C268F1E.5070506@egenix.com>
Message-ID: <AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>

On Sat, Jun 26, 2010 at 4:37 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Brett Cannon wrote:
>> On Wed, Jun 23, 2010 at 14:53, Brett Cannon <brett at python.org> wrote:
>>> I finally realized why clang has not been silencing its warnings about
>>> unused return values: I have -Wno-unused-value set in CFLAGS which
>>> comes before OPT (which defines -Wall) as set in PY_CFLAGS in
>>> Makefile.pre.in.
>>>
>>> I could obviously set OPT in my environment, but that would override
>>> the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
>>> but the README says that's for stuff that tweak binary compatibility.
>>>
>>> So basically what I am asking is what environment variable should I
>>> use? If CFLAGS is correct then does anyone have any issues if I change
>>> the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
>>> after OPT?
>>>
>>
>> Since no one objected I swapped the order in r82259. In case anyone
>> else uses clang to compile Python, this means that -Wno-unused-value
>> will now work to silence the warning about unused return values that
>> is caused by some macros. Probably using -Wno-empty-body is also good
>> to avoid all the warnings triggered by the UCS4 macros in cjkcodecs.
>
> I think you need to come up with a different solution and revert
> the change...
>
> OPT has historically been the only variable to use for
> adjusting the Python C compiler settings.
>
> As the name implies this was usually used to adjust the
> optimizer settings, including raising the optimization level
> from the default or disabling it.
>
> With your change CFLAGS will always override OPT and thus
> any optimization definitions made in OPT will no longer
> have an effect.
>
> Note that CFLAGS defines -O2 on many platforms.
>
> In your particular case, you should try setting OPT to
> "... -Wno-unused-value ..." (ie. replace -Wall with your
> setting).

The python configure environment variables are really confused. If OPT
is intended to be user-overridden for optimization settings, it
shouldn't be used to set -Wall and -Wstrict-prototypes. If it's
intended to set warning options, it shouldn't also set optimization
options. Setting the user-visible customization option on the
configure command line shouldn't stomp unrelated defaults.

In configure-based systems, CFLAGS is traditionally
(http://sources.redhat.com/automake/automake.html#Flag-Variables-Ordering)
the way to tack options onto the end of the command line. Python
breaks this by threading flags through CFLAGS in the makefile, which
means they all get stomped if the user sets CFLAGS on the make command
line. We should instead use another spelling ("CFlags"?) for the
internal variable, and append $(CFLAGS) to it.
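
The ordering argument above can be illustrated with a small sketch. It
assumes a GCC-style compiler, where the last -O option on the command
line is the one that takes effect. The helper names and the default
flag string are hypothetical and are not taken from Python's Makefile.

```python
import shlex


def compose(internal_flags, user_cflags):
    """Append the user's CFLAGS after the project's internal flags."""
    return f"{internal_flags} {user_cflags}".strip()


def effective_opt_level(command_line):
    """Return the last -O... flag on the command line, or None.

    Assumption: GCC-style semantics, where the last -O option wins.
    """
    level = None
    for flag in shlex.split(command_line):
        if flag.startswith("-O"):
            level = flag
    return level


# Hypothetical internal defaults, for illustration only.
internal = "-g -O0 -Wall"

print(effective_opt_level(compose(internal, "")))     # internal default kept
print(effective_opt_level(compose(internal, "-O2")))  # user setting overrides
```

With the user's flags appended last, a user who sets nothing keeps the
defaults, and a user who sets an optimization level overrides only that
one setting rather than the whole flag list.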

AC_PROG_CC is the macro that sets CFLAGS to -g -O2 on gcc-based
systems (http://www.gnu.org/software/hello/manual/autoconf/C-Compiler.html#index-AC_005fPROG_005fCC-842).
If Python's configure.in sets an otherwise-empty CFLAGS to -g before
calling AC_PROG_CC, AC_PROG_CC won't change it. Or we could just
preserve the user's CFLAGS setting across AC_PROG_CC regardless of
whether it's set, to let the user set CFLAGS on the configure line
without stomping any defaults.

From ncoghlan at gmail.com  Sun Jun 27 07:53:59 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Sun, 27 Jun 2010 15:53:59 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100627034922.31A663A4108@sparrow.telecommunity.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
Message-ID: <AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>

On Sun, Jun 27, 2010 at 1:49 PM, P.J. Eby <pje at telecommunity.com> wrote:
> I just hate the idea that functions taking strings should have to be
> *rewritten* to be explicitly type-agnostic.  It seems *so* un-Pythonic...
> like if all the bitmasking functions you'd ever written using 32-bit int
> constants had to be rewritten just because we added longs to the language,
> and you had to upcast them to be compatible or something.  Sounds too much
> like C or Java or some other non-Python language, where dynamism and
> polymorphy are the special case, instead of the general rule.

The difference is that we have three classes of algorithm here:
- those that work only on octet sequences
- those that work only on character sequences
- those that can work on either

Python 2 lumped all 3 classes of algorithm together through the
multi-purpose 8-bit str type. The unicode type provided some scope to
separate out the second category, but the divisions were rather
blurry.

Python 3 forces the first two to be separated by using either octets
(bytes/bytearray) or characters (str). There are a *very small* number
of APIs where it is appropriate to be polymorphic, but this is
currently difficult due to the need to supply literals of the
appropriate type for the objects being operated on.

This isn't ever going to happen automagically due to the need to
explicitly provide two literals (one for octet sequences, one for
character sequences).
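
A minimal sketch of what "providing two literals" looks like in
practice, assuming a toy URL helper. The function name split_scheme is
hypothetical and is not an existing stdlib API.

```python
def split_scheme(url):
    """Split 'scheme://rest' for either str or bytes input.

    The separator literal must match the argument type, so the
    function has to carry one literal per supported type.
    """
    if isinstance(url, str):
        sep = "://"
    elif isinstance(url, (bytes, bytearray)):
        sep = b"://"
    else:
        raise TypeError("expected str, bytes or bytearray")
    scheme, found, rest = url.partition(sep)
    if not found:
        # url[:0] is an empty value of the same type as the input.
        return url[:0], url
    return scheme, rest


print(split_scheme("http://python.org/dev"))
print(split_scheme(b"http://python.org/dev"))
```

The algorithm is identical for both types. Only the literal differs,
which is the part a helper type such as the proposed poly_str would
have to abstract away.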

The virtues of a separate poly_str type are that:
1. It can be simple and implemented in Python, dispatching to str or
bytes as appropriate (probably in the strings module)
2. No chance of impacting the performance of the core interpreter (as
builtins are not affected)
3. Lower impact if it turns out to have been a bad idea

We could talk about this even longer, but the most effective way
forward is going to be a patch that improves the URL parsing
situation.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From solipsis at pitrou.net  Sun Jun 27 11:10:59 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 27 Jun 2010 11:10:59 +0200
Subject: [Python-Dev] bytes / unicode
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
Message-ID: <20100627111059.49cdb698@pitrou.net>

On Sat, 26 Jun 2010 23:49:11 -0400
"P.J. Eby" <pje at telecommunity.com> wrote:
> 
> Remember, bytes and strings already have to detect mixed-type 
> operations.

Not in Python 3. They just raise a TypeError on bad
("mixed-type") arguments.
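
A quick check of this behaviour under Python 3 could look like the
following sketch. The helper name mixes_cleanly is made up for the
example.

```python
def mixes_cleanly(left, right):
    """Return True if left + right succeeds, False on TypeError."""
    try:
        left + right
    except TypeError:
        return False
    return True


print(mixes_cleanly("a", "b"))    # str + str
print(mixes_cleanly(b"a", b"b"))  # bytes + bytes
print(mixes_cleanly("a", b"b"))   # str + bytes
print(mixes_cleanly(b"a", "b"))   # bytes + str
```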

Regards

Antoine.


From greg.ewing at canterbury.ac.nz  Sun Jun 27 11:48:22 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Sun, 27 Jun 2010 21:48:22 +1200
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <i04i0s$qjq$2@dough.gmane.org>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<i039jr$h8$1@dough.gmane.org> <4C25B319.8040804@canterbury.ac.nz>
	<i04i0s$qjq$2@dough.gmane.org>
Message-ID: <4C271E66.5050902@canterbury.ac.nz>

Stefan Behnel wrote:
> Greg Ewing, 26.06.2010 09:58:
> 
>> Would there be any sanity in having an option to compile
>> Python with UTF-8 as the internal string representation?
> 
> It would break Py_UNICODE, because the internal size of a unicode 
> character would no longer be fixed.

It's not fixed anyway with the 2-char build -- some
characters are represented using a pair of surrogates.
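
The surrogate-pair point can be illustrated with the UTF-16 codec. This
is only a sketch: it counts 16-bit code units in the encoded form and
does not inspect the interpreter's internal buffer. The helper name is
hypothetical.

```python
def utf16_code_units(ch):
    """Number of 16-bit code units needed to encode ch as UTF-16."""
    return len(ch.encode("utf-16-le")) // 2


print(utf16_code_units("A"))           # BMP character
print(utf16_code_units("\u20ac"))      # EURO SIGN, still in the BMP
print(utf16_code_units("\U00010000"))  # first code point outside the BMP
```

Code points above U+FFFF need two code units, so a 16-bit
representation is already variable-width for them.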

-- 
Greg

From g.brandl at gmx.net  Sun Jun 27 11:41:56 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 27 Jun 2010 11:41:56 +0200
Subject: [Python-Dev] Adopt A Demo [was: Signs of neglect?]
In-Reply-To: <i03b71$538$1@dough.gmane.org>
References: <i03b71$538$1@dough.gmane.org>
Message-ID: <i076g0$de5$1@dough.gmane.org>

Am 26.06.2010 00:38, schrieb Steve Holden:
> I was pretty stunned when I tried this. Remember that the Tools
> subdirectory is distributed with Windows, so this means we got through
> almost two releases without anyone realizing that 2to3 does not appear
> to have touched this code.
> 
> Yes, I have: http://bugs.python.org/issue9083
> 
> When's 3.2 due out?

The alpha stage is beginning next week; still enough time to fix the
Tools and Demos.  I can do some of the work, however, if I have to do
it all, I'll just throw out the majority of that stuff.

So -- if every dev "adopted" a Tool or Demo, that would be quite a
manageable piece of work, and maybe a few demos can be brought up
to scratch instead of being deleted.

I'll go ahead and promise to care for the "Demo/classes" subdir.

Georg

-- 
Thus spake the Lord: Thou shalt indent with four spaces. No more, no less.
Four shall be the number of spaces thou shalt indent, and the number of thy
indenting shall be four. Eight shalt thou not indent, nor either indent thou
two, excepting that thou then proceed to four. Tabs are right out.


From g.brandl at gmx.net  Sun Jun 27 11:44:31 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Sun, 27 Jun 2010 11:44:31 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <hvoqvm$a8o$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>	<hvm6cu$gaq$1@dough.gmane.org>	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>	<hvo3ln$55n$1@dough.gmane.org>	<hvogc0$tpt$2@dough.gmane.org>
	<hvoqvm$a8o$1@dough.gmane.org>
Message-ID: <i076kq$de5$2@dough.gmane.org>

Am 22.06.2010 01:01, schrieb Terry Reedy:
> On 6/21/2010 3:59 PM, Steve Holden wrote:
>> Terry Reedy wrote:
>>> On 6/21/2010 8:33 AM, Nick Coghlan wrote:
>>>
>>>> P.S. (We're going to have a tough decision to make somewhere along the
>>>> line where docs.python.org is concerned, too - when do we flick the
>>>> switch and make a 3.x version of the docs the default?
>>>
>>> Easy. When 3.2 is released. When 2.7 is released, 3.2 becomes 'trunk'.
>>> Trunk releases always take over docs.python.org. To do otherwise would
>>> be to say that 3.2 is not a real trunk release and not yet ready for
>>> real use -- a major slam.
>>>
>>> Actually, I thought this was already discussed and decided ;-).
>>>
>> This also gives the 2.7 release its day in the sun before relegation to
>> maintenance status.
> 
> Every new version (except 3.0 and 3.1) has gone to maintenance status 
> *and* becomes the featured release on docs.python.org the day it was 
> released.  2.7 would just spend less time as the featured release on 
> that page.

I'm not sure 3.2 should take over in December just yet.  (There's also
docs3.python.org that always lands at the latest 3.x documentation).

However, there will be enough time to discuss this when 3.2 is actually
about to be released.

Georg



From dickinsm at gmail.com  Sun Jun 27 11:57:08 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Sun, 27 Jun 2010 10:57:08 +0100
Subject: [Python-Dev] Adopt A Demo [was: Signs of neglect?]
In-Reply-To: <i076g0$de5$1@dough.gmane.org>
References: <i03b71$538$1@dough.gmane.org>
	<i076g0$de5$1@dough.gmane.org>
Message-ID: <AANLkTim6hPMn6p6SkuU-0bavFPAK4TIKVRaeCOqJBqx7@mail.gmail.com>

On Sun, Jun 27, 2010 at 10:41 AM, Georg Brandl <g.brandl at gmx.net> wrote:
> So -- if every dev "adopted" a Tool or Demo, that would be quite a
> manageable piece of work, and maybe a few demos can be brought up
> to scratch instead of being deleted.
>
> I'll go ahead and promise to care for the "Demo/classes" subdir.

Bagsy the Demo/parser subdirectory.  Fixing up unparse.py looks like
it could be fun.

Mark

From eric at trueblade.com  Sun Jun 27 12:53:00 2010
From: eric at trueblade.com (Eric Smith)
Date: Sun, 27 Jun 2010 06:53:00 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <4C271E66.5050902@canterbury.ac.nz>
References: <11597.1277401099@parc.com>	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>	<i039jr$h8$1@dough.gmane.org>
	<4C25B319.8040804@canterbury.ac.nz>	<i04i0s$qjq$2@dough.gmane.org>
	<4C271E66.5050902@canterbury.ac.nz>
Message-ID: <4C272D8C.6010406@trueblade.com>

On 6/27/2010 5:48 AM, Greg Ewing wrote:
> Stefan Behnel wrote:
>> Greg Ewing, 26.06.2010 09:58:
>>
>>> Would there be any sanity in having an option to compile
>>> Python with UTF-8 as the internal string representation?
>>
>> It would break Py_UNICODE, because the internal size of a unicode
>> character would no longer be fixed.
>
> It's not fixed anyway with the 2-char build -- some
> characters are represented using a pair of surrogates.
>

But isn't this currently ignored everywhere in python's code?

Eric.


From stephen at xemacs.org  Sun Jun 27 16:03:06 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Sun, 27 Jun 2010 23:03:06 +0900
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100626181753.601473A4108@sparrow.telecommunity.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
Message-ID: <877hlkmv39.fsf@uwakimon.sk.tsukuba.ac.jp>

P.J. Eby writes:
 > At 12:42 PM 6/26/2010 +0900, Stephen J. Turnbull wrote:
 > >What I'm saying here is that if bytes are the signal of validity, and
 > >the stdlib functions preserve validity, then it's better to have the
 > >stdlib functions object to unicode data as an argument.  Compare the
 > >alternative: it returns a unicode object which might get passed around
 > >for a while before one of your functions receives it and identifies it
 > >as unvalidated data.
 > 
 > I still don't follow,

OK, I give up, since it was your use case that concerned me.  I
obviously misunderstood.  Sorry for the confusion.

    Sign me,
    +1 on polymorphic functions in Tsukuba Japan

 > >In general this is a hard problem, though.  Polymorphism, OK, one-way
 > >tainting OK, but in general combining related types is pretty
 > >arbitrary, and as in the encoded-bytes case, the result type often
 > >varies depending on expectations of callers, not the types of the
 > >data.
 > 
 > But the caller can enforce those expectations by passing in arguments 
 > whose types do what they want in such cases, as long as the string 
 > literals used by the function don't get to override the relevant 
 > parts of the string protocol(s).

This simply isn't true for encoded bytes as proposed.  For encoded
text, the current encoding has no deterministic relationship to the
desired encoding (at the level of generality of the stdlib; of course
in specific applications it may be mandated by a standard or private
convention).

I will have to pass on your other user-defined string types.  I've
never tried to implement one.  I only wanted to point out that a
user-controllable tainted string type would be preferable to
confounding "unicode" with "tainted".


From alexander.belopolsky at gmail.com  Sun Jun 27 16:47:08 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Sun, 27 Jun 2010 10:47:08 -0400
Subject: [Python-Dev] Adopt A Demo [was: Signs of neglect?]
In-Reply-To: <AANLkTim6hPMn6p6SkuU-0bavFPAK4TIKVRaeCOqJBqx7@mail.gmail.com>
References: <i03b71$538$1@dough.gmane.org> <i076g0$de5$1@dough.gmane.org>
	<AANLkTim6hPMn6p6SkuU-0bavFPAK4TIKVRaeCOqJBqx7@mail.gmail.com>
Message-ID: <AANLkTilGc_3LcUmgYCPgMUDvDzl8AN9ChKN9tvO87h8L@mail.gmail.com>

On Sun, Jun 27, 2010 at 5:57 AM, Mark Dickinson <dickinsm at gmail.com> wrote:
> On Sun, Jun 27, 2010 at 10:41 AM, Georg Brandl <g.brandl at gmx.net> wrote:
>> So -- if every dev "adopted" a Tool or Demo, that would be quite a
>> manageable piece of work, and maybe a few demos can be brought up
>> to scratch instead of being deleted.
>>
>> I'll go ahead and promise to care for the "Demo/classes" subdir.
>
> Bagsy the Demo/parser subdirectory.  Fixing up unparse.py looks like
> it could be fun.

I have a patch for pybench attached to a not so related issue at
http://bugs.python.org/issue5180 .  All it took was a 2to3 run and a
one line change.  Of course it needs a review before it can go in, but
I am surprised that something like pybench was not updated a long time
ago.  Is it supposed to be single source?  That would make sense given
the nature of the tool.


From mal at egenix.com  Sun Jun 27 18:33:53 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Sun, 27 Jun 2010 18:33:53 +0200
Subject: [Python-Dev] Adopt A Demo [was: Signs of neglect?]
In-Reply-To: <AANLkTilGc_3LcUmgYCPgMUDvDzl8AN9ChKN9tvO87h8L@mail.gmail.com>
References: <i03b71$538$1@dough.gmane.org>
	<i076g0$de5$1@dough.gmane.org>	<AANLkTim6hPMn6p6SkuU-0bavFPAK4TIKVRaeCOqJBqx7@mail.gmail.com>
	<AANLkTilGc_3LcUmgYCPgMUDvDzl8AN9ChKN9tvO87h8L@mail.gmail.com>
Message-ID: <4C277D71.1010802@egenix.com>

Alexander Belopolsky wrote:
> On Sun, Jun 27, 2010 at 5:57 AM, Mark Dickinson <dickinsm at gmail.com> wrote:
>> On Sun, Jun 27, 2010 at 10:41 AM, Georg Brandl <g.brandl at gmx.net> wrote:
>>> So -- if every dev "adopted" a Tool or Demo, that would be quite a
>>> manageable piece of work, and maybe a few demos can be brought up
>>> to scratch instead of being deleted.
>>>
>>> I'll go ahead and promise to care for the "Demo/classes" subdir.
>>
>> Bagsy the Demo/parser subdirectory.  Fixing up unparse.py looks like
>> it could be fun.
> 
> I have a patch for pybench attached to a not so related issue at
> http://bugs.python.org/issue5180 .  All it took was a 2to3 run and a
> one line change.  Of course it needs a review before it can go in, but
> I am surprised that something like pybench was not updated a long time
> ago.  Is it supposed to be single source?

Yes, the idea was to keep the number of changes to a minimum and to
have the Python3 version work with Python 2.6, 2.7 and 3.x.

Antoine worked on that, AFAIR.

The Python2 version of pybench needs to work with more than just
Python 2.6 and 2.7 to be able to compare performance of the various
releases back to version 2.3.

-- 
Marc-Andre Lemburg
eGenix.com


From pje at telecommunity.com  Sun Jun 27 19:02:28 2010
From: pje at telecommunity.com (P.J. Eby)
Date: Sun, 27 Jun 2010 13:02:28 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
	<AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
Message-ID: <20100627170805.1785F3A4099@sparrow.telecommunity.com>

At 03:53 PM 6/27/2010 +1000, Nick Coghlan wrote:
>We could talk about this even longer, but the most effective way
>forward is going to be a patch that improves the URL parsing
>situation.

Certainly, it's the only practical solution for the immediate problems in 3.2.

I only mentioned that I "hate the idea" because I'd be more 
comfortable if it was explicitly declared to be a temporary hack to 
work around the absence of a string coercion protocol, due to the 
moratorium on language changes.

But, since the moratorium *is* in effect, I'll try to make this my 
last post on string protocols for a while...  and maybe wait until 
I've looked at the code (str/bytes C implementations) in more detail 
and can make a more concrete proposal for what the protocol would be 
and how it would work.  (Not to mention closer to the end of the moratorium.)


>There are a *very small* number of APIs where it is appropriate to 
>be polymorphic

This is only true if you focus exclusively on bytes vs. unicode, 
rather than the general issue that it's currently impractical to pass 
*any* sort of user-defined string type through code that you don't 
directly control (stdlib or third-party).


>The virtues of a separate poly_str type are that:
>1. It can be simple and implemented in Python, dispatching to str or
>bytes as appropriate (probably in the strings module)
>2. No chance of impacting the performance of the core interpreter (as
>builtins are not affected)

Note that adding a string coercion protocol isn't going to change 
core performance for existing cases, since any place where the 
protocol would be invoked would be a code branch that either throws 
an error or *already* falls back to some other protocol (e.g. the 
buffer protocol).


>3. Lower impact if it turns out to have been a bad idea

How many protocols have been added that turned out to be bad 
ideas?  The only ones that have been removed in 3.x, IIRC, are 
three-way compare, slice-specific operations, and __coerce__...  and 
I'm going to miss __cmp__.  ;-)

However, IIUC, the reason these protocols were dropped isn't because 
they were "bad ideas".  Rather, they're things that can be 
implemented in terms of a finer-grained protocol.  i.e., if you want 
__cmp__ or __getslice__ or __coerce__, you can always implement them 
via a mixin that converts the newer fine-grained protocols into 
invocations of the older protocol.  (As I plan to do for __cmp__ in 
the handful of places I use it.)
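
A minimal sketch of such a mixin for __cmp__, written for Python 3.
The class names CmpMixin and Version are hypothetical and serve only as
an illustration of the idea.

```python
class CmpMixin:
    """Derive the rich comparison methods from a single __cmp__ method.

    Subclasses define __cmp__(other), returning a negative, zero or
    positive int, or NotImplemented for unsupported operands.
    """

    def _compare(self, other, test):
        result = self.__cmp__(other)
        if result is NotImplemented:
            return NotImplemented
        return test(result)

    def __eq__(self, other):
        return self._compare(other, lambda r: r == 0)

    def __ne__(self, other):
        return self._compare(other, lambda r: r != 0)

    def __lt__(self, other):
        return self._compare(other, lambda r: r < 0)

    def __le__(self, other):
        return self._compare(other, lambda r: r <= 0)

    def __gt__(self, other):
        return self._compare(other, lambda r: r > 0)

    def __ge__(self, other):
        return self._compare(other, lambda r: r >= 0)


class Version(CmpMixin):
    """Toy example type that only implements __cmp__."""

    def __init__(self, major, minor):
        self.major = major
        self.minor = minor

    def __cmp__(self, other):
        if not isinstance(other, Version):
            return NotImplemented
        a = (self.major, self.minor)
        b = (other.major, other.minor)
        return (a > b) - (a < b)


print(Version(1, 2) < Version(1, 3))
print(Version(2, 0) == Version(2, 0))
```

Because the mixin defines __eq__ without __hash__, instances are
unhashable unless the subclass also defines __hash__.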

At the moment, however, this isn't possible for multi-string 
operations outside of __add__/__radd__ and comparison -- the coercion 
rules are hard-wired and can't be overridden by user-defined types.
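
To make the limitation concrete, here is a small sketch with a
hypothetical Tainted wrapper type. Concatenation with + can be
intercepted through __radd__, while str.join rejects the same object.

```python
class Tainted:
    """Hypothetical user-defined string-like wrapper (illustration only)."""

    def __init__(self, text):
        self.text = text

    @staticmethod
    def _text_of(value):
        return value.text if isinstance(value, Tainted) else value

    def __add__(self, other):
        return Tainted(self.text + self._text_of(other))

    def __radd__(self, other):
        # Reached for "abc" + Tainted(...) once str.__add__ declines.
        return Tainted(self._text_of(other) + self.text)


combined = "abc" + Tainted("x")
print(type(combined).__name__, combined.text)

try:
    ", ".join([Tainted("x")])
except TypeError as exc:
    # join only accepts str items, and there is no hook to override that.
    print("join failed:", exc)
```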


From solipsis at pitrou.net  Sun Jun 27 19:50:33 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Sun, 27 Jun 2010 19:50:33 +0200
Subject: [Python-Dev] pybench
References: <i03b71$538$1@dough.gmane.org> <i076g0$de5$1@dough.gmane.org>
	<AANLkTim6hPMn6p6SkuU-0bavFPAK4TIKVRaeCOqJBqx7@mail.gmail.com>
	<AANLkTilGc_3LcUmgYCPgMUDvDzl8AN9ChKN9tvO87h8L@mail.gmail.com>
Message-ID: <20100627195033.224713c2@pitrou.net>

On Sun, 27 Jun 2010 10:47:08 -0400
Alexander Belopolsky <alexander.belopolsky at gmail.com> wrote:
> 
> I have a patch for pybench attached to a not so related issue at
> http://bugs.python.org/issue5180 .  All it took was a 2to3 run and a
> one line change.  Of course it need a review before it can go in, but
> I am surprised that something like pybench was not updated long time
> ago.

Why do you say that? pybench works fine under Python 3 (the py3k branch
version of pybench, that is). The patch doesn't look necessary to me.


From tjreedy at udel.edu  Sun Jun 27 21:03:31 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 27 Jun 2010 15:03:31 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <i076kq$de5$2@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>	<hvm6cu$gaq$1@dough.gmane.org>	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>	<hvo3ln$55n$1@dough.gmane.org>	<hvogc0$tpt$2@dough.gmane.org>	<hvoqvm$a8o$1@dough.gmane.org>
	<i076kq$de5$2@dough.gmane.org>
Message-ID: <i087a3$cpg$1@dough.gmane.org>

On 6/27/2010 5:44 AM, Georg Brandl wrote:
> Am 22.06.2010 01:01, schrieb Terry Reedy:
>> On 6/21/2010 3:59 PM, Steve Holden wrote:
>>> Terry Reedy wrote:
>>>> On 6/21/2010 8:33 AM, Nick Coghlan wrote:
>>>>
>>>>> P.S. (We're going to have a tough decision to make somewhere along the
>>>>> line where docs.python.org is concerned, too - when do we flick the
>>>>> switch and make a 3.x version of the docs the default?
>>>>
>>>> Easy. When 3.2 is released. When 2.7 is released, 3.2 becomes 'trunk'.
>>>> Trunk releases always take over docs.python.org. To do otherwise would
>>>> be to say that 3.2 is not a real trunk release and not yet ready for
>>>> real use -- a major slam.
>>>>
>>>> Actually, I thought this was already discussed and decided ;-).
>>>>
>>> This also gives the 2.7 release its day in the sun before relegation to
>>> maintenance status.
>>
>> Every new version (except 3.0 and 3.1) has gone to maintenance status
>> *and* becomes the featured release on docs.python.org the day it was
>> released.  2.7 would just spend less time as the featured release on
>> that page.
>
> I'm not sure 3.2 should take over in December just yet.  (There's also
> docs3.python.org that always lands at the latest 3.x documentation).
>
> However, there will be enough time to discuss this when 3.2 is actually
> about to be released.

Sure. Since I expect that the argument for treating 3.2 as a regular 
production-use-ready release will be stronger then than now, I agree on 
deferring discussion.

-- 
Terry Jan Reedy


From martin at v.loewis.de  Sun Jun 27 21:25:06 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Sun, 27 Jun 2010 21:25:06 +0200
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <201006211113.06767.stephan.richter@gmail.com>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikPkjakahWprT9QvgN1RUOWb3keT8sTY5AKkV49@mail.gmail.com>	<AANLkTilEmtv8vGzuZAOVDlQIYvvoezGH4yNiNg8bIHmo@mail.gmail.com>
	<201006211113.06767.stephan.richter@gmail.com>
Message-ID: <4C27A592.8010206@v.loewis.de>

Am 21.06.2010 17:13, schrieb Stephan Richter:
> On Monday, June 21, 2010, Nick Coghlan wrote:
>> A decent listing of major packages that already support Python 3 would
>> be very handy for the new Python2orPython3 page I created on the wiki,
>> and easier to keep up-to-date. (the old Early2to3Migrations page
>> didn't look particularly up to date, but hopefully we can keep the new
>> list in a happier state).
> 
> I really just want to be able to go to PyPI, Click on "Browse packages" and 
> then select "Python 3" (it can currently be accomplished by clicking "Python" 
> and then  "3"). 

Or you can use the link "Python 3 packages" on PyPI's main menu.

Regards,
Martin

From bugtrack at roumenpetrov.info  Sun Jun 27 21:25:16 2010
From: bugtrack at roumenpetrov.info (Roumen Petrov)
Date: Sun, 27 Jun 2010 22:25:16 +0300
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTilSHp1JJinM-dQBRXqIPXkajMvvBkiglJTJqovZ@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTilSHp1JJinM-dQBRXqIPXkajMvvBkiglJTJqovZ@mail.gmail.com>
Message-ID: <4C27A59C.6040005@roumenpetrov.info>

Brett Cannon wrote:
> On Sat, Jun 26, 2010 at 16:37, M.-A. Lemburg<mal at egenix.com>  wrote:
>> Brett Cannon wrote:
>>> On Wed, Jun 23, 2010 at 14:53, Brett Cannon<brett at python.org>  wrote:
[SKIP]
>>> Since no one objected I swapped the order in r82259. In case anyone
>>> else uses clang to compile Python, this means that -Wno-unused-value
>>> will now work to silence the warning about unused return values that
>>> is caused by some macros. Probably using -Wno-empty-body is also good
>>> to avoid all the warnings triggered by the UCS4 macros in cjkcodecs.


Right now you cannot change the order of CFLAGS and OPT.


>> I think you need to come up with a different solution and revert
>> the change...
>>
>> OPT has historically been the only variable to use for
>> adjusting the Python C compiler settings.
>
> Just found the relevant section in the README.
>
>>
>> As the name implies this was usually used to adjust the
>> optimizer settings, including raising the optimization level
>> from the default or disabling it.
>
> It meant optional to me, not optimization. I hate abbreviations sometimes.
>
>>
>> With your change CFLAGS will always override OPT and thus
>> any optimization definitions made in OPT will no longer
>> have an effect.
>
> That was the point; OPT defines defaults through configure.in and I
> simply wanted to add to those instead of having OPT completely
> overwritten by me.

Now, if you can confirm that this setting (see configure.in):
      # Optimization messes up debuggers, so turn it off for
      # debug builds.
     OPT="-g -O0 -Wall $STRICT_PROTO"
is not an issue for py3k, then leave your commit as is (note that Mark
pointed this out).
But if optimization does "mess up debuggers", you may want to revert the
change.


I know that it is difficult to reach consensus on compiler/preprocessor
flags for the Python build process. Here is a short list of issues about
this:
- "Python 2.5 64 bit compile fails on Solaris 10/gcc 4.1.1" : 
http://bugs.python.org/issue1628484  (committed/rejected)
- "Python does not honor "CFLAGS" environment variable" : 
http://bugs.python.org/issue1453 (won't fix)
- "configure: allow user-provided CFLAGS to override AC_PROG_CC 
defaults" : http://bugs.python.org/issue8211 (fixed)

This one is still open: "configure doesn't set up CFLAGS properly"
( http://bugs.python.org/issue1104249 ). It should be closed as fixed.


>> Note that CFLAGS defines -O2 on many platforms.
>
> So then wouldn't that mean they want that to be the optimization
> level? Or is the historical reason that default exists is so that some
> default exists but to expect the application to override as desired?
>
>>
>> In your particular case, you should try setting OPT to
>> "... -Wno-unused-value ..." (ie. replace -Wall with your
>> setting).
>
> So what is CFLAGS for then? ``configure -h`` says it's for "C compiler
> flags"; that's extremely ambiguous. And it doesn't help that OPT is
> not mentioned by ``configure -h`` as that is what I have always gone
> by to know what flags are available for compilation.
>
> -Brett

If you would like to see such flags listed, you could look at
http://bugs.python.org/issue3718 for how to make a variable visible in
configure --help. In addition, AC_ARG_VAR allows an environment variable
to be cached for subsequent runs of config.status; otherwise you must
specify it on the configure command line.

As for all the XXflags variables: it would be good to simplify the
configure script to use only CPPFLAGS and CFLAGS, to minimize
configuration troubles and other build failures. A good example of what
happens when configure sets preprocessor/compiler flags in variables
other than CPPFLAGS/CFLAGS is the issue
"OSX: duplicate -arch flags in CFLAGS breaks sysconfig" ( 
http://bugs.python.org/issue8607 )

Roumen

From dickinsm at gmail.com  Sun Jun 27 21:43:34 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Sun, 27 Jun 2010 20:43:34 +0100
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <4C268F1E.5070506@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
Message-ID: <AANLkTinPBeVpytcZ5LgcPEOzvq48oZXCLrVPEpg_DQPX@mail.gmail.com>

On Sun, Jun 27, 2010 at 12:37 AM, M.-A. Lemburg <mal at egenix.com> wrote:
> Brett Cannon wrote:
>> On Wed, Jun 23, 2010 at 14:53, Brett Cannon <brett at python.org> wrote:
>>> I finally realized why clang has not been silencing its warnings about
>>> unused return values: I have -Wno-unused-value set in CFLAGS which
>>> comes before OPT (which defines -Wall) as set in PY_CFLAGS in
>>> Makefile.pre.in.
>>>
>>> I could obviously set OPT in my environment, but that would override
>>> the default OPT settings Python uses. I could put it in EXTRA_CFLAGS,
>>> but the README says that's for stuff that tweak binary compatibility.
>>>
>>> So basically what I am asking is what environment variable should I
>>> use? If CFLAGS is correct then does anyone have any issues if I change
>>> the order of things for PY_CFLAGS in the Makefile so that CFLAGS comes
>>> after OPT?
>>>
>>
>> Since no one objected I swapped the order in r82259. In case anyone
>> else uses clang to compile Python, this means that -Wno-unused-value
>> will now work to silence the warning about unused return values that
>> is caused by some macros. Probably using -Wno-empty-body is also good
>> to avoid all the warnings triggered by the UCS4 macros in cjkcodecs.
>
> I think you need to come up with a different solution and revert
> the change...

Agreed; this needs more thought.

For one thing, Brett's change has the result that --with-pydebug
builds end up being built with -O2 instead of -O0, which can make
debugging (e.g., with gdb) somewhat awkward.
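
To illustrate why the ordering matters: with GCC-style compilers, when
several -O options appear on one command line, the last one takes
effect. The following is only a minimal Python sketch of that rule.
`effective_opt_level` is a hypothetical helper written for this
illustration (it is not part of Python's build system), and the flag
strings are simplified from the values mentioned in this thread.

```python
def effective_opt_level(flags):
    """Return the last -O option found in a flag string, or None.

    Models the GCC rule that, with multiple -O options, the last wins.
    """
    level = None
    for flag in flags.split():
        if flag.startswith("-O"):
            level = flag
    return level


# Simplified values from the thread: OPT for a --with-pydebug build,
# and the CFLAGS default that AC_PROG_CC picks for gcc.
OPT = "-g -O0 -Wall"
CFLAGS = "-g -O2"

before_swap = CFLAGS + " " + OPT   # CFLAGS first, OPT last
after_swap = OPT + " " + CFLAGS    # OPT first, CFLAGS last (as in r82259)

print(effective_opt_level(before_swap))
print(effective_opt_level(after_swap))
```

With OPT last, the debug build keeps -O0; with CFLAGS last, the -O2
default ends up overriding it.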

Mark

From dickinsm at gmail.com  Sun Jun 27 22:04:56 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Sun, 27 Jun 2010 21:04:56 +0100
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
Message-ID: <AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>

On Sun, Jun 27, 2010 at 6:46 AM, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> AC_PROG_CC is the macro that sets CFLAGS to -g -O2 on gcc-based
> systems (http://www.gnu.org/software/hello/manual/autoconf/C-Compiler.html#index-AC_005fPROG_005fCC-842).
> If Python's configure.in sets an otherwise-empty CFLAGS to -g before
> calling AC_PROG_CC, AC_PROG_CC won't change it. Or we could just
> preserve the users CFLAGS setting across AC_PROG_CC regardless of
> whether it's set, to let the user set CFLAGS on the configure line
> without stomping any defaults.

I think saving and restoring CFLAGS across AC_PROG_CC was attempted in
http://bugs.python.org/issue8211 . It turned out that it broke OS X
universal builds.

I'm not sure I understand the importance of allowing AC_PROG_CC to set
CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC);  can
someone give an example of why this is necessary?

Mark

From tjreedy at udel.edu  Sun Jun 27 22:07:56 2010
From: tjreedy at udel.edu (Terry Reedy)
Date: Sun, 27 Jun 2010 16:07:56 -0400
Subject: [Python-Dev] #Python3 ! ? (was Python Library Support in 3.x)
In-Reply-To: <i087a3$cpg$1@dough.gmane.org>
References: <20100618050712.GC20639@thorne.id.au>	<AANLkTikAyK-AGQsMnc7_JEUwVRA2_TBcvMM1V-7S5lWN@mail.gmail.com>	<20100619121256.2412.244251859.divmod.xquotient.130@localhost.localdomain>	<AANLkTikN4Dtver6_lDH2DPLaHyjs63oik0_rJ27ditXq@mail.gmail.com>	<hvjlpt$8pe$1@dough.gmane.org>	<AANLkTim3bHnzdmdnhH-PVWNmG_wBjz4vR1Ybip_qdltr@mail.gmail.com>	<hvlu18$npp$1@dough.gmane.org>	<63486FB9-866D-47D3-AF04-0A621AB416A4@ikanobori.jp>	<AANLkTilabOkEeRRhdFZ8Uz5hDivmYqWigKj9ncnXpQa0@mail.gmail.com>	<hvm6cu$gaq$1@dough.gmane.org>	<AANLkTin_FFoJltyKKD3NTuT8CkIbopP7_ynsstuYPkgb@mail.gmail.com>	<AANLkTim-6Tfsur9rsIwbwY5VLUcEP_BdWP6ULBZ67lJP@mail.gmail.com>	<hvo3ln$55n$1@dough.gmane.org>	<hvogc0$tpt$2@dough.gmane.org>	<hvoqvm$a8o$1@dough.gmane.org>	<i076kq$de5$2@dough.gmane.org>
	<i087a3$cpg$1@dough.gmane.org>
Message-ID: <i08b2r$nnm$1@dough.gmane.org>


> Sure. Since I expect that the argument for treating 3.2 as a regular
> production-use-ready release will be stronger then than now, I agree on
> differing discussion.

I meant 'deferring'

-- 
Terry Jan Reedy


From jyasskin at gmail.com  Sun Jun 27 22:37:48 2010
From: jyasskin at gmail.com (Jeffrey Yasskin)
Date: Sun, 27 Jun 2010 13:37:48 -0700
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com> 
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com> 
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com> 
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
Message-ID: <AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>

On Sun, Jun 27, 2010 at 1:04 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
> On Sun, Jun 27, 2010 at 6:46 AM, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
>> AC_PROG_CC is the macro that sets CFLAGS to -g -O2 on gcc-based
>> systems (http://www.gnu.org/software/hello/manual/autoconf/C-Compiler.html#index-AC_005fPROG_005fCC-842).
>> If Python's configure.in sets an otherwise-empty CFLAGS to -g before
>> calling AC_PROG_CC, AC_PROG_CC won't change it. Or we could just
>> preserve the users CFLAGS setting across AC_PROG_CC regardless of
>> whether it's set, to let the user set CFLAGS on the configure line
>> without stomping any defaults.
>
> I think saving and restoring CFLAGS across AC_PROG_CC was attempted in
> http://bugs.python.org/issue8211 . It turned out that it broke OS X
> universal builds.

Thanks for the link to the issue. http://bugs.python.org/issue8366
says Ronald Oussoren fixed the universal builds without reverting the
CFLAGS propagation.

> I'm not sure I understand the importance of allowing AC_PROG_CC to set
> CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC); can
> someone give an example of why this is necessary?

Marc-Andre's argument seems to be "it's possible that AC_PROG_CC adds
other flags as well (it currently doesn't, but that may well change in
future versions of autoconf)." That seems a little weak to constrain
fixing actual problems today. If it ever adds more arguments, we'll
need to inspect them anyway to see if they're more like -g or -O2
(wanted or harmful).

Jeffrey

From brett at python.org  Sun Jun 27 22:50:23 2010
From: brett at python.org (Brett Cannon)
Date: Sun, 27 Jun 2010 13:50:23 -0700
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com> 
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com> 
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com> 
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com> 
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
Message-ID: <AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>

On Sun, Jun 27, 2010 at 13:37, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> On Sun, Jun 27, 2010 at 1:04 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
>> On Sun, Jun 27, 2010 at 6:46 AM, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
>>> AC_PROG_CC is the macro that sets CFLAGS to -g -O2 on gcc-based
>>> systems (http://www.gnu.org/software/hello/manual/autoconf/C-Compiler.html#index-AC_005fPROG_005fCC-842).
>>> If Python's configure.in sets an otherwise-empty CFLAGS to -g before
>>> calling AC_PROG_CC, AC_PROG_CC won't change it. Or we could just
>>> preserve the users CFLAGS setting across AC_PROG_CC regardless of
>>> whether it's set, to let the user set CFLAGS on the configure line
>>> without stomping any defaults.
>>
>> I think saving and restoring CFLAGS across AC_PROG_CC was attempted in
>> http://bugs.python.org/issue8211 . It turned out that it broke OS X
>> universal builds.
>
> Thanks for the link to the issue. http://bugs.python.org/issue8366
> says Ronald Oussoren fixed the universal builds without reverting the
> CFLAGS propagation.
>
>> I'm not sure I understand the importance of allowing AC_PROG_CC to set
>> CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC); can
>> someone give an example of why this is necessary?
>
> Marc-Andre's argument seems to be "it's possible that AC_PROG_CC adds
> other flags as well (it currently doesn't, but that may well change in
> future versions of autoconf)." That seems a little weak to constrain
> fixing actual problems today. If it ever adds more arguments, we'll
> need to inspect them anyway to see if they're more like -g or -O2
> (wanted or harmful).
>

I went ahead and reverted the change, but it does seem like the build
environment could use a cleanup.

From dickinsm at gmail.com  Sun Jun 27 22:54:06 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Sun, 27 Jun 2010 21:54:06 +0100
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
Message-ID: <AANLkTilHpxKOFpYxyA7lJ3is2mBbnDjmhYTLNiE5LUPL@mail.gmail.com>

On Sun, Jun 27, 2010 at 9:37 PM, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
> On Sun, Jun 27, 2010 at 1:04 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
>> I think saving and restoring CFLAGS across AC_PROG_CC was attempted in
>> http://bugs.python.org/issue8211 . It turned out that it broke OS X
>> universal builds.
>
> Thanks for the link to the issue. http://bugs.python.org/issue8366
> says Ronald Oussoren fixed the universal builds without reverting the
> CFLAGS propagation.

Yes, you're right (of course).  Thanks.  Looking at the current
configure.in, CFLAGS *does* get saved and restored across the
AC_PROG_CC call if it's non-empty;  I'm not sure whether this actually
(currently) has any effect, since as I understand the documentation
CFLAGS won't be touched by AC_PROG_CC if it's already set.
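
As a rough model of the documented behaviour being discussed here (a
sketch only, not autoconf's actual shell code): AC_PROG_CC fills in a
default for CFLAGS only when the variable is not already set. The
function name and the use of None to stand for "not set in the
environment" are assumptions made for this illustration.

```python
def cflags_after_ac_prog_cc(cflags, gnu_compiler=True):
    """Model AC_PROG_CC's documented default handling of CFLAGS.

    cflags is None when the user did not set CFLAGS at all; any other
    value is treated as a user setting and is left untouched.
    """
    if cflags is not None:
        return cflags
    # Documented defaults: "-g -O2" for GNU C, "-g" for other compilers.
    return "-g -O2" if gnu_compiler else "-g"


print(cflags_after_ac_prog_cc(None))
print(cflags_after_ac_prog_cc("-Wno-unused-value"))
```

Under this model, saving and restoring a user-supplied CFLAGS around
the call changes nothing, which matches the doubt expressed above.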

>> I'm not sure I understand the importance of allowing AC_PROG_CC to set
>> CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC); can
>> someone give an example of why this is necessary?
>
> Marc-Andre's argument seems to be "it's possible that AC_PROG_CC adds
> other flags as well (it currently doesn't, but that may well change in
> future versions of autoconf)." That seems a little weak to constrain
> fixing actual problems today. If it ever adds more arguments, we'll
> need to inspect them anyway to see if they're more like -g or -O2
> (wanted or harmful).

Okay;  thanks for the explanation.

Mark

From greg.ewing at canterbury.ac.nz  Mon Jun 28 00:35:36 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 28 Jun 2010 10:35:36 +1200
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <4C272D8C.6010406@trueblade.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<i039jr$h8$1@dough.gmane.org> <4C25B319.8040804@canterbury.ac.nz>
	<i04i0s$qjq$2@dough.gmane.org> <4C271E66.5050902@canterbury.ac.nz>
	<4C272D8C.6010406@trueblade.com>
Message-ID: <4C27D238.3060100@canterbury.ac.nz>

Eric Smith wrote:

> But isn't this currently ignored everywhere in python's code?

It's true that code using a utf-8 build would have to be
aware of the fact much more often. But I'm thinking of
applications that would otherwise want to keep all their
strings encoded to save memory. If they do that, they
also need to deal with sequence items not corresponding
to characters. If they can handle that, they may be able
to handle utf-8 just as well.
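
A small illustration of "sequence items not corresponding to
characters", using today's Python 3 str and bytes types rather than the
hypothetical utf-8 build discussed here. The sample text is made up.

```python
text = "na\u00efve caf\u00e9"      # two non-ASCII characters
encoded = text.encode("utf-8")

print(len(text))      # counts code points
print(len(encoded))   # counts bytes; each accented letter takes two

# Item 2 of the encoded form is only the first half of a character,
# so it cannot be decoded on its own.
try:
    encoded[2:3].decode("utf-8")
except UnicodeDecodeError:
    print("byte 2 is not a complete character")

# Slicing on the right boundaries recovers the character.
print(encoded[2:4].decode("utf-8") == "\u00ef")
```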

-- 
Greg

From rdmurray at bitdance.com  Mon Jun 28 01:31:21 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Sun, 27 Jun 2010 19:31:21 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
	<AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
Message-ID: <20100627233121.E1E0821948D@kimball.webabinitio.net>

I've been watching this discussion with intense interest, but have
been so lagged in following the thread that I haven't replied.
I got caught up today....

On Sun, 27 Jun 2010 15:53:59 +1000, Nick Coghlan <ncoghlan at gmail.com> wrote:
> The difference is that we have three classes of algorithm here:
> - those that work only on octet sequences
> - those that work only on character sequences
> - those that can work on either
> 
> Python 2 lumped all 3 classes of algorithm together through the
> multi-purpose 8-bit str type. The unicode type provided some scope to
> separate out the second category, but the divisions were rather
> blurry.
> 
> Python 3 forces the first two to be separated by using either octets
> (bytes/bytearray) or characters (str). There are a *very small* number
> of APIs where it is appropriate to be polymorphic, but this is
> currently difficult due to the need to supply literals of the
> appropriate type for the objects being operated on.
> 
> This isn't ever going to happen automagically due to the need to
> explicitly provide two literals (one for octet sequences, one for
> character sequences).

In email6 I'm currently handling this by putting the algorithm on a
base class and the literals on 'Bytes...' and 'String...'  subclasses as
class variables.  Slightly ugly, but it works.
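
A minimal sketch of that pattern, for readers who have not seen it. The
class names and the header-joining task are invented for illustration
and are not taken from the email6 code: the algorithm is written once on
a base class, and each subclass supplies literals of the matching type
as class variables.

```python
class _HeaderJoiner:
    """Algorithm lives here; subclasses supply literals of the right type."""

    SEP = None    # separator between header name and value
    CRLF = None   # line terminator

    def join(self, name, value):
        # Works for bytes or str, as long as all four operands agree.
        return name + self.SEP + value + self.CRLF


class BytesHeaderJoiner(_HeaderJoiner):
    SEP = b": "
    CRLF = b"\r\n"


class StringHeaderJoiner(_HeaderJoiner):
    SEP = ": "
    CRLF = "\r\n"


print(BytesHeaderJoiner().join(b"Subject", b"hello"))
print(StringHeaderJoiner().join("Subject", "hello"))
```

Mixing the two (for example passing str to the bytes variant) raises
TypeError, which is the Python 3 separation discussed in this thread.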

The current design also speaks to an earlier point someone made about the
fact that we are really dealing with more complex, and domain specific,
data, not simply "byte strings".  A "BytesMessage" contains lots of
structured encoding information as well as the possibility of 'garbage'
bytes.  A StringMessage contains text and data decoded into objects
(ex: an image object), possibly with some PEP 383 surrogates included
(haven't quite figured that part out yet).  So, a BytesMessage object
isn't just a byte string, it's a load of structured data that requires
the associated algorithms to convert into meaningful text and objects.
Going the other way, the decisions made about character encodings need to
be encoded into the structured bytes representation that could ultimately
go out on the wire.
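
For reference, this is the PEP 383 mechanism itself (the
"surrogateescape" error handler), shown on a made-up header line. It
only demonstrates the round trip; how email6 will actually use it is,
as noted above, not settled.

```python
# A Latin-1 byte (0xE9) inside data that is supposed to be ASCII.
raw = b"Subject: caf\xe9\r\n"

# Undecodable bytes become lone surrogates instead of raising an error.
text = raw.decode("ascii", errors="surrogateescape")
print(ascii(text[12]))

# Encoding with the same handler restores the original bytes exactly.
round_tripped = text.encode("ascii", errors="surrogateescape")
print(round_tripped == raw)
```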

I suspect that the same thing needs to be done for URIs/IRIs, and
html/MIME and the corresponding text and objects.  It is my hope that
the email6 work will lay a firm foundation for the latter, but URI/IRI
is a whole different protocol that I'm glad I don't have to deal with :)

> The virtues of a separate poly_str type are that:

Having such a poly_str type would probably make my life easier.

I would also just like to vent a little frustration at having to
use single-character-slice notation when I want to index a character
in a string in my algorithms....

--
R. David Murray                                      www.bitdance.com

From rdmurray at bitdance.com  Mon Jun 28 01:41:48 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Sun, 27 Jun 2010 19:41:48 -0400
Subject: [Python-Dev] thoughts on the bytes/string discussion
In-Reply-To: <26215.1277505652@parc.com>
References: <11597.1277401099@parc.com>
	<AANLkTimnuUVE91yQRRjAL9aSzgThLKBBAy-KhlPKv3WP@mail.gmail.com>
	<AANLkTilBTCgrL9AeBLk83vgwdLe2fuiWQxdctjPm01fH@mail.gmail.com>
	<96ADD4CE-3A24-45A7-B219-2940195DC3D0@twistedmatrix.com>
	<AANLkTik-X3aBxBZXGgX29tM5jW2_e3-lRHrWk1Zqbtu5@mail.gmail.com>
	<26215.1277505652@parc.com>
Message-ID: <20100627234148.9618021948F@kimball.webabinitio.net>

On Fri, 25 Jun 2010 15:40:52 -0700, Bill Janssen <janssen at parc.com> wrote:
> Guido van Rossum <guido at python.org> wrote:
> > So you're really just worried about space consumption. I'd like to see
> > a lot of hard memory profiling data before I got overly worried about
> > that.
> 
> While I've seen some big Web pages, I think the email folks, who often
> have to process messages with attachments measuring in the tens of
> megabytes, have the stronger problems here, and I think speed may be
> more important than memory.  I've built both a Web server and an IMAP
> server in Python, and the IMAP server is where the issues of storage
> management really prevail.  If you have to convert a 20 MB encoded
> string into a Unicode string just to look at the headers as strings, you
> have issues.  (The Python email package doesn't do that, by the way.)

Unfortunately in the current Python3 email package (email5), this is no
longer true.  You have to decode everything *first* in order to pass it
through email (which presents a few problems when dealing with 8bit data,
as has been mentioned here before).

email6 intends to fix this, and once again allow you to decode to text
only the bits you actually need to access and manipulate.

--
R. David Murray                                      www.bitdance.com

From rdmurray at bitdance.com  Mon Jun 28 02:00:17 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Sun, 27 Jun 2010 20:00:17 -0400
Subject: [Python-Dev] email package status in 3.X
In-Reply-To: <medczp3dq3tj4doi18062010025250@SMTP>
References: <medczp3dq3tj4doi18062010025250@SMTP>
Message-ID: <20100628000017.F3B732194BA@kimball.webabinitio.net>

On Fri, 18 Jun 2010 18:52:45 -0000, lutz at rmi.net wrote:
> What I'm suggesting is that extreme caution be exercised from
> this point forward with all things 3.X-related.  Whether you
> wish to accept this or not, 3.X has a negative image to many.
> This suggestion specifically includes not abandoning current
> 3.X email package users as a case in point.  Ripping the rug
> out from new 3.X users after they took the time to port seems
> like it may be just enough to tip the scales altogether.

Catching up on my python-dev email, I just want to clarify this with
respect to email.  (1) I suspect that the new API will be enough of a
carrot that they won't mind converting to it, BUT, (2) the plan is to
provide a compatibility API that will fully support the current Python3
email5 API (but with fewer bugs in areas such as header folding and
unfolding).

--
R. David Murray                                      www.bitdance.com

From greg at krypto.org  Mon Jun 28 06:33:36 2010
From: greg at krypto.org (Gregory P. Smith)
Date: Sun, 27 Jun 2010 21:33:36 -0700
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
	thread usage, and a path to getting rid of the GIL
In-Reply-To: <4C262D37.7020807@animats.com>
References: <4C259A25.1060705@animats.com> <4C2600B4.5020503@voidspace.org.uk>
	<AANLkTikpDB4FsFCESF2Ub0ZhXJiZyERZF6zLAjUovcr8@mail.gmail.com> 
	<4C262D37.7020807@animats.com>
Message-ID: <AANLkTimr2S5U_xDOLptoUjIrWCBrfelSS4FiE_bscLQL@mail.gmail.com>

fyi - newthreading has been picked up by lwn.

 http://lwn.net/Articles/393822/#Comments

From greg.ewing at canterbury.ac.nz  Mon Jun 28 10:28:45 2010
From: greg.ewing at canterbury.ac.nz (Greg Ewing)
Date: Mon, 28 Jun 2010 20:28:45 +1200
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100627233121.E1E0821948D@kimball.webabinitio.net>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
	<AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
	<20100627233121.E1E0821948D@kimball.webabinitio.net>
Message-ID: <4C285D3D.80907@canterbury.ac.nz>

R. David Murray wrote:

> Having such a poly_str type would probably make my life easier.

A thought on this poly_str type: perhaps it could be
called "ascii", since that's what it would have to be
restricted to, and have

   a'xxx'

as a literal syntax for it, seeing as literals seem to
be one of its main use cases.

> I would also just like to vent a little frustration at having to
> use single-character-slice notation when I want to index a character
> in a string in my algorithms....

Thinking way outside the square, and probably the pale
as well, maybe @ could be pressed into service as an
infix operator, with

   s at i

being equivalent to

   s[i:i+1]

-- 
Greg

From orsenthil at gmail.com  Mon Jun 28 10:25:26 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Mon, 28 Jun 2010 13:55:26 +0530
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C285D3D.80907@canterbury.ac.nz>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
	<AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
	<20100627233121.E1E0821948D@kimball.webabinitio.net>
	<4C285D3D.80907@canterbury.ac.nz>
Message-ID: <20100628082526.GA6509@remy>

On Mon, Jun 28, 2010 at 08:28:45PM +1200, Greg Ewing wrote:
> A thought on this poly_str type: perhaps it could be
> called "ascii", since that's what it would have to be
> restricted to, and have
> 
>   a'xxx'
> 
> as a literal syntax for it, seeing as literals seem to
> be one of its main use cases.

This seems like a good idea.

> 
> Thinking way outside the square, and probably the pale
> as well, maybe @ could be pressed into service as an
> infix operator, with
> 
>   s at i
> 
> being equivalent to
> 
>   s[i:i+1]
> 

And this is way beyond being intuitive. 


-- 
Senthil

From rdmurray at bitdance.com  Mon Jun 28 13:24:48 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Mon, 28 Jun 2010 07:24:48 -0400
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <20100628082526.GA6509@remy>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
	<AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
	<20100627233121.E1E0821948D@kimball.webabinitio.net>
	<4C285D3D.80907@canterbury.ac.nz> <20100628082526.GA6509@remy>
Message-ID: <20100628112448.348771FD0CD@kimball.webabinitio.net>

On Mon, 28 Jun 2010 13:55:26 +0530, Senthil Kumaran <orsenthil at gmail.com> wrote:
> On Mon, Jun 28, 2010 at 08:28:45PM +1200, Greg Ewing wrote:
> > Thinking way outside the square, and probably the pale
> > as well, maybe @ could be pressed into service as an
> > infix operator, with
> > 
> >   s at i
> > 
> > being equivalent to
> > 
> >   s[i:i+1]
> > 
> 
> And this is way beyond being intuitive. 

Agreed, -1 on that.  Like I said, I was just venting.  The decision
to have indexing bytes return an int is set in stone now and I
just have to live with it.
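
For anyone following along, the behaviour in question looks like this
in Python 3 (the sample data is made up):

```python
data = b"From: x"
text = "From: x"

# Indexing a str gives a one-character str ...
print(text[0] == "F")

# ... but indexing bytes gives an int, not a one-byte bytes object.
print(data[0] == ord("F"))

# The single-item slice is what yields a bytes object of length one,
# and the same slice also works on str, so it suits shared algorithms.
print(data[0:1] == b"F")
print(text[0:1] == "F")
```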

--
R. David Murray                                      www.bitdance.com

From mal at egenix.com  Mon Jun 28 13:38:31 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 28 Jun 2010 13:38:31 +0200
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
Message-ID: <4C2889B7.2060105@egenix.com>

Brett Cannon wrote:
> On Sun, Jun 27, 2010 at 13:37, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
>> On Sun, Jun 27, 2010 at 1:04 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
>>> On Sun, Jun 27, 2010 at 6:46 AM, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
>>>> AC_PROG_CC is the macro that sets CFLAGS to -g -O2 on gcc-based
>>>> systems (http://www.gnu.org/software/hello/manual/autoconf/C-Compiler.html#index-AC_005fPROG_005fCC-842).
>>>> If Python's configure.in sets an otherwise-empty CFLAGS to -g before
>>>> calling AC_PROG_CC, AC_PROG_CC won't change it. Or we could just
>>>> preserve the users CFLAGS setting across AC_PROG_CC regardless of
>>>> whether it's set, to let the user set CFLAGS on the configure line
>>>> without stomping any defaults.
>>>
>>> I think saving and restoring CFLAGS across AC_PROG_CC was attempted in
>>> http://bugs.python.org/issue8211 . It turned out that it broke OS X
>>> universal builds.
>>
>> Thanks for the link to the issue. http://bugs.python.org/issue8366
>> says Ronald Oussoren fixed the universal builds without reverting the
>> CFLAGS propagation.
>>
>>> I'm not sure I understand the importance of allowing AC_PROG_CC to set
>>> CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC);  can
>>> someone give an example of why this is necessary?
>>
>> Marc-Andre's argument seems to be "it's possible that AC_PROG_CC adds
>> other flags as well (it currently doesn't, but that may well change in
>> future versions of autoconf)." That seems a little weak to constrain
>> fixing actual problems today. If it ever adds more arguments, we'll
>> need to inspect them anyway to see if they're more like -g or -O2
>> (wanted or harmful).

Please see the discussion on the ticket for details.

AC_PROG_CC provides the basic defaults for the CFLAGS compiler
settings depending on which compiler is chosen/found:

http://www.gnu.org/software/hello/manual/autoconf/C-Compiler.html

> I went ahead and reverted the change, but it does seem like the build
> environment could use a cleanup.

Thanks and, indeed, the build system environment variable usage does
need a cleanup. It's a larger project, though, and one that will likely
break existing build setups.

-- 
Marc-Andre Lemburg
eGenix.com

From ncoghlan at gmail.com  Mon Jun 28 14:13:53 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Mon, 28 Jun 2010 22:13:53 +1000
Subject: [Python-Dev] bytes / unicode
In-Reply-To: <4C285D3D.80907@canterbury.ac.nz>
References: <20100625130801.1E9A83A4099@sparrow.telecommunity.com>
	<8739wbnl0m.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100625222722.594D23A4099@sparrow.telecommunity.com>
	<87oceympcu.fsf@uwakimon.sk.tsukuba.ac.jp>
	<20100626181753.601473A4108@sparrow.telecommunity.com>
	<AANLkTinEgrBLIStlGLOd49qSgEDHOJ7PJdy6mXroPdDd@mail.gmail.com>
	<20100627034922.31A663A4108@sparrow.telecommunity.com>
	<AANLkTim8eXkLjMycMhhV2FkM2OYcQhfj6YyOkSPUCYm_@mail.gmail.com>
	<20100627233121.E1E0821948D@kimball.webabinitio.net>
	<4C285D3D.80907@canterbury.ac.nz>
Message-ID: <AANLkTikLKH7Xa1ttDYPk4otMORR-sLglTolkQI4KfYoR@mail.gmail.com>

On Mon, Jun 28, 2010 at 6:28 PM, Greg Ewing <greg.ewing at canterbury.ac.nz> wrote:
> R. David Murray wrote:
>
>> Having such a poly_str type would probably make my life easier.
>
> A thought on this poly_str type: perhaps it could be
> called "ascii", since that's what it would have to be
> restricted to, and have
>
>   a'xxx'
>
> as a literal syntax for it, seeing as literals seem to
> be one of its main use cases.

One of the virtues of doing this as a helper type in a module
somewhere (probably string) is that we can defer that kind of decision
until later.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From dickinsm at gmail.com  Mon Jun 28 15:50:37 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 28 Jun 2010 14:50:37 +0100
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <4C2889B7.2060105@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
	<4C2889B7.2060105@egenix.com>
Message-ID: <AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>

On Mon, Jun 28, 2010 at 12:38 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> On Sun, Jun 27, 2010 at 13:37, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
>>> On Sun, Jun 27, 2010 at 1:04 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
>>>> I'm not sure I understand the importance of allowing AC_PROG_CC to set
>>>> CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC);  can
>>>> someone give an example of why this is necessary?
>>>
>>> Marc-Andre's argument seems to be "it's possible that AC_PROG_CC adds
>>> other flags as well (it currently doesn't, but that may well change in
>>> future versions of autoconf)." That seems a little weak to constrain
>>> fixing actual problems today. If it ever adds more arguments, we'll
>>> need to inspect them anyway to see if they're more like -g or -O2
>>> (wanted or harmful).
>
> Please see the discussion on the ticket for details.

Yes, I've done that.  It's repeatedly asserted in that discussion that
AC_PROG_CC should be allowed to initialize an otherwise empty CFLAGS,
but nowhere in that discussion does it explain *why* this is
desirable.  What would be so bad about not allowing AC_PROG_CC to
initialize CFLAGS?  (E.g., by setting an otherwise empty CFLAGS to
'-g' before the AC_PROG_CC invocation.)  That would fix the issue of
the unwanted -O2 flag that AC_PROG_CC otherwise adds.

Mark

From mal at egenix.com  Mon Jun 28 16:04:04 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 28 Jun 2010 16:04:04 +0200
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>	<4C268F1E.5070506@egenix.com>	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>	<4C2889B7.2060105@egenix.com>
	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
Message-ID: <4C28ABD4.1030000@egenix.com>

Mark Dickinson wrote:
> On Mon, Jun 28, 2010 at 12:38 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>> On Sun, Jun 27, 2010 at 13:37, Jeffrey Yasskin <jyasskin at gmail.com> wrote:
>>>> On Sun, Jun 27, 2010 at 1:04 PM, Mark Dickinson <dickinsm at gmail.com> wrote:
>>>>> I'm not sure I understand the importance of allowing AC_PROG_CC to set
>>>>> CFLAGS (if CFLAGS is undefined at the point of the AC_PROG_CC);  can
>>>>> someone give an example of why this is necessary?
>>>>
>>>> Marc-Andre's argument seems to be "it's possible that AC_PROG_CC adds
>>>> other flags as well (it currently doesn't, but that may well change in
>>>> future versions of autoconf)." That seems a little weak to constrain
>>>> fixing actual problems today. If it ever adds more arguments, we'll
>>>> need to inspect them anyway to see if they're more like -g or -O2
>>>> (wanted or harmful).
>>
>> Please see the discussion on the ticket for details.
> 
> Yes, I've done that.  It's repeatedly asserted in that discussion that
> AC_PROG_CC should be allowed to initialize an otherwise empty CFLAGS,
> but nowhere in that discussion does it explain *why* this is
> desirable.  What would be so bad about not allowing AC_PROG_CC to
> initialize CFLAGS?  (E.g., by setting an otherwise empty CFLAGS to
> '-g' before the AC_PROG_CC invocation.)  That would fix the issue of
> the unwanted -O2 flag that AC_PROG_CC otherwise adds.

Why do you think that the default -O2 is unwanted and how do you know
whether the compiler accepts -g as option ?

-- 
Marc-Andre Lemburg
eGenix.com


From dickinsm at gmail.com  Mon Jun 28 16:22:19 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 28 Jun 2010 15:22:19 +0100
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <4C28ABD4.1030000@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
	<4C2889B7.2060105@egenix.com>
	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
	<4C28ABD4.1030000@egenix.com>
Message-ID: <AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>

On Mon, Jun 28, 2010 at 3:04 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Why do you think that the default -O2 is unwanted

Because it can cause debug builds of Python to be built with
optimization enabled, as we've already seen at least twice.

> and how do you know
> whether the compiler accepts -g as option ?

I don't.  It could easily be tested for, though.  Alternatively,
setting an empty CFLAGS to '-g' could be done just for gcc, since this
is the only compiler for which AC_PROG_CC adds -O2.

Mark

From mal at egenix.com  Mon Jun 28 17:28:03 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 28 Jun 2010 17:28:03 +0200
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>	<4C268F1E.5070506@egenix.com>	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>	<4C2889B7.2060105@egenix.com>	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>	<4C28ABD4.1030000@egenix.com>
	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
Message-ID: <4C28BF83.9080903@egenix.com>

Mark Dickinson wrote:
> On Mon, Jun 28, 2010 at 3:04 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Why do you think that the default -O2 is unwanted
> 
> Because it can cause debug builds of Python to be built with
> optimization enabled, as we've already seen at least twice.

Then let me put it this way:

How many Python users will compile Python in debug mode ?

The point is that the default build of Python should use
the correct production settings for the C compiler out of
the box and that's what AC_PROG_CC is all about.

I'm pretty sure that Python developers who want to use a
debug build have enough code foo to get the -O2 turned into a -O0
either by adjusting OPT and/or by providing their own CFLAGS env var.

Also note that in some cases you may actually want to have
a debug build with optimizations turned on, e.g. to track down
a compiler optimization bug.

>> and how do you know
>> whether the compiler accepts -g as option ?
> 
> I don't.  It could easily be tested for, though.  Alternatively,
> setting an empty CFLAGS to '-g' could be done just for gcc, since this
> is the only compiler for which AC_PROG_CC adds -O2.

... and then end up with default Python builds which don't have
debug symbols available to track down core dumps, etc. ?

AC_PROG_CC checks whether the compiler supports -g and always
uses it in that case. The option is supported by more compilers
than just GCC. E.g. IBM's xlC and Intel's icl compilers support
that option as well.

-- 
Marc-Andre Lemburg
eGenix.com


From mal at egenix.com  Mon Jun 28 17:31:40 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 28 Jun 2010 17:31:40 +0200
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <4C28BF83.9080903@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>	<4C268F1E.5070506@egenix.com>	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>	<4C2889B7.2060105@egenix.com>	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>	<4C28ABD4.1030000@egenix.com>	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
	<4C28BF83.9080903@egenix.com>
Message-ID: <4C28C05C.80008@egenix.com>

M.-A. Lemburg wrote:
> Mark Dickinson wrote:
>> On Mon, Jun 28, 2010 at 3:04 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>> Why do you think that the default -O2 is unwanted
>>
>> Because it can cause debug builds of Python to be built with
>> optimization enabled, as we've already seen at least twice.
> 
> Then let me put it this way:
> 
> How many Python users will compile Python in debug mode ?
> 
> The point is that the default build of Python should use
> the correct production settings for the C compiler out of
> the box and that's what AC_PROG_CC is all about.
> 
> I'm pretty sure that Python developers who want to use a
> debug build have enough code foo to get the -O2 turned into a -O0
> either by adjust OPT and/or by providing their own CFLAGS env var.
> 
> Also note that in some cases you may actually want to have
> a debug build with optimizations turned on, e.g. to track down
> a compiler optimization bug.
> 
>>> and how do you know
>>> whether the compiler accepts -g as option ?
>>
>> I don't.  It could easily be tested for, though.  Alternatively,
>> setting an empty CFLAGS to '-g' could be done just for gcc, since this
>> is the only compiler for which AC_PROG_CC adds -O2.
> 
> ... and then end up with default Python builds which don't have
> debug symbols available to track down core dumps, etc. ?
> 
> AC_PROG_CC checks whether the compiler supports -g and always
> uses it in that case. The option is supported by more compilers
> than just GCC. E.g. IBM's xlC and Intel's icl compilers support
> that option as well.

Sorry, Intel's compiler is called "icc", not "icl":

http://software.intel.com/sites/products/documentation/hpc/compilerpro/en-us/cpp/mac/man/icc.txt

IBM's compiler:

http://publib.boulder.ibm.com/infocenter/macxhelp/v6v81/index.jsp?topic=/com.ibm.vacpp6m.doc/compiler/ref/ruoptlst.htm

-- 
Marc-Andre Lemburg
eGenix.com


From guido at python.org  Mon Jun 28 17:39:22 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Jun 2010 08:39:22 -0700
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
	thread usage, and a path to getting rid of the GIL
In-Reply-To: <AANLkTimr2S5U_xDOLptoUjIrWCBrfelSS4FiE_bscLQL@mail.gmail.com>
References: <4C259A25.1060705@animats.com> <4C2600B4.5020503@voidspace.org.uk>
	<AANLkTikpDB4FsFCESF2Ub0ZhXJiZyERZF6zLAjUovcr8@mail.gmail.com> 
	<4C262D37.7020807@animats.com>
	<AANLkTimr2S5U_xDOLptoUjIrWCBrfelSS4FiE_bscLQL@mail.gmail.com>
Message-ID: <AANLkTinoBtOL4Qoa4POi48iSTBqRg861qqUwtmEBriZb@mail.gmail.com>

On Sun, Jun 27, 2010 at 9:33 PM, Gregory P. Smith <greg at krypto.org> wrote:
> fyi - newthreading has been picked up by lwn.
>  http://lwn.net/Articles/393822/#Comments

Do you know if any of the commenters is Nagle himself (and if so,
which)? The discussion is hard to follow since the context of replies
isn't always clear. There also seems to be a bunch of C++ thinking
(and some knee-jerk responses by people who aren't actually all that
familiar with Python) although I admit I don't have much of an
intuition about memory models for fully free threading myself. It's a
brave new world...

--Guido

-- 
--Guido van Rossum (python.org/~guido)

From dickinsm at gmail.com  Mon Jun 28 17:44:00 2010
From: dickinsm at gmail.com (Mark Dickinson)
Date: Mon, 28 Jun 2010 16:44:00 +0100
Subject: [Python-Dev] what environment variable should contain compiler
	warning suppression flags?
In-Reply-To: <4C28BF83.9080903@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
	<4C2889B7.2060105@egenix.com>
	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
	<4C28ABD4.1030000@egenix.com>
	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
	<4C28BF83.9080903@egenix.com>
Message-ID: <AANLkTil8Dg3RkDXq1VmPy5kTDXSMVqR3WQ4iy2A7F80L@mail.gmail.com>

On Mon, Jun 28, 2010 at 4:28 PM, M.-A. Lemburg <mal at egenix.com> wrote:
> Mark Dickinson wrote:
>> On Mon, Jun 28, 2010 at 3:04 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>> Why do you think that the default -O2 is unwanted
>>
>> Because it can cause debug builds of Python to be built with
>> optimization enabled, as we've already seen at least twice.
>
> Then let me put it this way:
>
> How many Python users will compile Python in debug mode ?
>
> The point is that the default build of Python should use
> the correct production settings for the C compiler out of
> the box and that's what AC_PROG_CC is all about.
>
> I'm pretty sure that Python developers who want to use a
> debug build have enough code foo to get the -O2 turned into a -O0
> either by adjust OPT and/or by providing their own CFLAGS env var.

Shrug.  Clearly someone at some point in the past thought it was a
good idea to have --with-pydebug builds use -O0.  If there's going to
be a deliberate decision to drop that now, then that's fine with me.

>> I don't.  It could easily be tested for, though.  Alternatively,
>> setting an empty CFLAGS to '-g' could be done just for gcc, since this
>> is the only compiler for which AC_PROG_CC adds -O2.
>
> ... and then end up with default Python builds which don't have
> debug symbols available to track down core dumps, etc. ?

No, I don't see how that follows.  I was suggesting that *for gcc
only*, an empty CFLAGS be set to '-g' before calling AC_PROG_CC.  The
*only* effect this would have would be that for gcc, if the user
hasn't specified CFLAGS, then CFLAGS ends up being '-g' rather than
'-g -O2' after the AC_PROG_CC call. But I'm really not looking for an
argument here;  I just wanted to understand why you thought AC_PROG_CC
setting CFLAGS was important, and you've explained that.  Thanks.
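
The suggestion above can be sketched as a toy model. This is only an
illustration in Python, not the actual configure logic (which is shell/m4);
the function names are hypothetical, and the defaults assumed for AC_PROG_CC
are the ones discussed in this thread ('-g -O2' for gcc, '-g' for other
compilers that accept it, and only when CFLAGS is unset).

```python
def ac_prog_cc(cflags, is_gcc, accepts_g=True):
    """Toy model of how AC_PROG_CC picks CFLAGS (assumed behaviour).

    cflags is None when the user has not set CFLAGS at all.
    """
    if cflags is not None:
        return cflags              # a user-supplied value is left alone
    if is_gcc:
        return "-g -O2" if accepts_g else "-O2"
    return "-g" if accepts_g else ""


def configure_cflags(user_cflags, is_gcc):
    """The proposal: for gcc only, preset an unset CFLAGS to '-g'."""
    cflags = user_cflags
    if cflags is None and is_gcc:
        cflags = "-g"
    return ac_prog_cc(cflags, is_gcc)


print(ac_prog_cc(None, is_gcc=True))          # current default for gcc
print(configure_cflags(None, is_gcc=True))    # with the proposed preset
print(configure_cflags("-O3", is_gcc=True))   # user settings are untouched
```

In this model the only case whose result changes is gcc with no
user-supplied CFLAGS.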

Mark

From mal at egenix.com  Mon Jun 28 18:03:23 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Mon, 28 Jun 2010 18:03:23 +0200
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <AANLkTil8Dg3RkDXq1VmPy5kTDXSMVqR3WQ4iy2A7F80L@mail.gmail.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>	<4C268F1E.5070506@egenix.com>	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>	<4C2889B7.2060105@egenix.com>	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>	<4C28ABD4.1030000@egenix.com>	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>	<4C28BF83.9080903@egenix.com>
	<AANLkTil8Dg3RkDXq1VmPy5kTDXSMVqR3WQ4iy2A7F80L@mail.gmail.com>
Message-ID: <4C28C7CB.8030600@egenix.com>

Mark Dickinson wrote:
> On Mon, Jun 28, 2010 at 4:28 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>> Mark Dickinson wrote:
>>> On Mon, Jun 28, 2010 at 3:04 PM, M.-A. Lemburg <mal at egenix.com> wrote:
>>>> Why do you think that the default -O2 is unwanted
>>>
>>> Because it can cause debug builds of Python to be built with
>>> optimization enabled, as we've already seen at least twice.
>>
>> Then let me put it this way:
>>
>> How many Python users will compile Python in debug mode ?
>>
>> The point is that the default build of Python should use
>> the correct production settings for the C compiler out of
>> the box and that's what AC_PROG_CC is all about.
>>
>> I'm pretty sure that Python developers who want to use a
>> debug build have enough code foo to get the -O2 turned into a -O0
>> either by adjust OPT and/or by providing their own CFLAGS env var.
> 
> Shrug.  Clearly someone at some point in the past thought it was a
> good idea to have --with-pydebug builds use -O0.  If there's going to
> be a deliberate decision to drop that now, then that's fine with me.

Ah right, the time machine again :-)

OPT already uses -O0 if --with-pydebug is used and the
compiler supports -g. Since OPT gets added after CFLAGS, the override
already happens...

>>> I don't.  It could easily be tested for, though.  Alternatively,
>>> setting an empty CFLAGS to '-g' could be done just for gcc, since this
>>> is the only compiler for which AC_PROG_CC adds -O2.
>>
>> ... and then end up with default Python builds which don't have
>> debug symbols available to track down core dumps, etc. ?
> 
> No, I don't see how that follows.  I was suggesting that *for gcc
> only*, an empty CFLAGS be set to '-g' before calling AC_PROG_CC.  The
> *only* effect this would have would be that for gcc, if the user
> hasn't specified CFLAGS, then CFLAGS ends up being '-g' rather than
> '-g -O2' after the AC_PROG_CC call. But I'm really not looking for an
> argument here;  I just wanted to understand why you thought AC_PROG_CC
> setting CFLAGS was important, and you've explained that.  Thanks.

Sorry, that was a misunderstanding on my part.

-- 
Marc-Andre Lemburg
eGenix.com


From techtonik at gmail.com  Mon Jun 28 18:05:13 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 28 Jun 2010 19:05:13 +0300
Subject: [Python-Dev] WPython 1.1 was released
In-Reply-To: <AANLkTilPBsYFOKWmpq5yXfSwzbAThT96DHKNOq-3Z6Uo@mail.gmail.com>
References: <AANLkTilRhQseGNZ7jB8noc7akkkbP0gPErSuoExqij2e@mail.gmail.com>
	<201006232112.41047.steve@pearwood.info>
	<AANLkTik-L_dmJnhS81jgUO6wHYu-73-twkrYjFgloHyf@mail.gmail.com>
	<hvtgpu$qoh$1@dough.gmane.org>
	<AANLkTilPBsYFOKWmpq5yXfSwzbAThT96DHKNOq-3Z6Uo@mail.gmail.com>
Message-ID: <AANLkTimlJCVk_SVSGbPBL1PNhjEkhUoiTsoj1qGr5HZX@mail.gmail.com>

It would be interesting to see benchmark diagrams inline on one page
with overall summaries. I've posted an enhancement request at
http://code.google.com/p/unladen-swallow/issues/detail?id=145 in case
somebody is going to look at that. I wonder whether a 32-bit version
could bring more speedups?
-- 
anatoly t.

From techtonik at gmail.com  Mon Jun 28 20:09:56 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Mon, 28 Jun 2010 21:09:56 +0300
Subject: [Python-Dev] Pickle security and remote logging
Message-ID: <AANLkTinvmLy1UaOODajeiD3ySn43cIQIPvnZHY4Yi_PS@mail.gmail.com>

Hello,

I need to send logging module output over the network. The module has
everything to make this happen, except security. The SocketHandler and
DatagramHandler examples use the pickle module, which is said to be
insecure. The SocketHandler and DatagramHandler docs should at least
contain a warning about the danger of exposing unpickling interfaces to
insecure networks.

The pickle documentation mentions that it is possible to control what gets
unpickled, but there is no example, nor any security analysis of whether
the proposed solution is actually secure. Is there any way to implement
secure network logging? I do not care about data encryption - I just do not
want my server exploited by malformed data.
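
For illustration, controlling what gets unpickled might look like the
following minimal sketch for Python 3. It assumes the payload only needs
plain built-in containers and scalars; the names RestrictedUnpickler and
restricted_loads are hypothetical. Refusing every global lookup blocks
pickles that reference arbitrary callables, but this is not a complete
security analysis (it does nothing about oversized or malformed input, for
example).

```python
import collections
import io
import pickle


class RestrictedUnpickler(pickle.Unpickler):
    """Refuse to resolve any global, so only plain built-in containers
    and scalars (dict, list, str, int, ...) can be reconstructed."""

    def find_class(self, module, name):
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name))


def restricted_loads(data):
    """Like pickle.loads(), but using the restricted unpickler."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


# A dict of simple values needs no global lookup and loads normally.
safe = restricted_loads(pickle.dumps({"msg": "hello", "levelno": 20}))

# A pickle that refers to a class by name is rejected.
try:
    restricted_loads(pickle.dumps(collections.OrderedDict))
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

A receiver that needs a few specific classes could return them from
find_class for an explicit allow-list instead of rejecting everything.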

-- 
anatoly t.

From zohair_ms at hotmail.com  Mon Jun 28 20:09:35 2010
From: zohair_ms at hotmail.com (Zohair)
Date: Mon, 28 Jun 2010 11:09:35 -0700 (PDT)
Subject: [Python-Dev]  Access a function
Message-ID: <29008798.post@talk.nabble.com>


I am very new to Python and have a small question.

I have a function:
set_time_at_next_pps(self, *args, **kwargs) but don't know how to use it.
Asking for your help, please.

Cheers,

Zoh
-- 
View this message in context: http://old.nabble.com/Access-a-function-tp29008798p29008798.html
Sent from the Python - python-dev mailing list archive at Nabble.com.


From fuzzyman at voidspace.org.uk  Mon Jun 28 20:39:08 2010
From: fuzzyman at voidspace.org.uk (Michael Foord)
Date: Mon, 28 Jun 2010 19:39:08 +0100
Subject: [Python-Dev] Access a function
In-Reply-To: <29008798.post@talk.nabble.com>
References: <29008798.post@talk.nabble.com>
Message-ID: <4C28EC4C.7030905@voidspace.org.uk>

On 28/06/2010 19:09, Zohair wrote:
> I am a very new to python and have a small question..
>
> I have a function:
> set_time_at_next_pps(self, *args, **kwargs) but don't know how to use it...
> Askign for your help please.
>    

Hi Zoh,

This mailing list is for the development *of* Python, not for questions 
about developing *with* Python. You should ask your question on a 
mailing list / newsgroup like python-list or python-tutor. python-list 
is available via google groups:

https://groups.google.com/group/comp.lang.python/topics

You haven't given enough information to answer the question, however. The 
first argument 'self' means that the function is probably a method of a 
class, and should be called on a class instance. The *args / **kwargs 
means that the function can take any number of positional or keyword 
arguments, which doesn't tell us anything about how the function should be 
used.
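
As a minimal sketch of what such a signature means: the class name Clock
and the method body below are hypothetical stand-ins, since the real
set_time_at_next_pps belongs to a library that isn't shown in the question.

```python
class Clock:
    """Hypothetical stand-in for whatever class defines the method."""

    def set_time_at_next_pps(self, *args, **kwargs):
        # args is a tuple of the positional arguments and kwargs a dict
        # of the keyword arguments; return both so the caller can see them.
        return args, kwargs


clock = Clock()                                          # create an instance
result = clock.set_time_at_next_pps(1.5, source="gps")   # call it on the instance
print(result)
```

Which arguments the real method actually expects can only be learned from
its documentation or source.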

You can find out more on Python functions in the tutorial:

http://docs.python.org/tutorial/controlflow.html#more-on-defining-functions

All the best,

Michael Foord

> Cheers,
>
> Zoh
>    


-- 
http://www.ironpythoninaction.com/
http://www.voidspace.org.uk/blog

READ CAREFULLY. By accepting and reading this email you agree, on behalf of your employer, to release me from all obligations and waivers arising from any and all NON-NEGOTIATED agreements, licenses, terms-of-service, shrinkwrap, clickwrap, browsewrap, confidentiality, non-disclosure, non-compete and acceptable use policies ("BOGUS AGREEMENTS") that I have entered into with your employer, its partners, licensors, agents and assigns, in perpetuity, without prejudice to my ongoing rights and privileges. You further represent that you have the authority to release me from any BOGUS AGREEMENTS on behalf of your employer.


From phd at phd.pp.ru  Mon Jun 28 20:42:28 2010
From: phd at phd.pp.ru (Oleg Broytman)
Date: Mon, 28 Jun 2010 22:42:28 +0400
Subject: [Python-Dev] Access a function
In-Reply-To: <29008798.post@talk.nabble.com>
References: <29008798.post@talk.nabble.com>
Message-ID: <20100628184228.GA17475@phd.pp.ru>

Hello.

   We are sorry, but we cannot help you. This mailing list is to work on
developing Python (fixing bugs and adding new features to Python itself); if
you're having problems using Python, please find another forum. Probably
python-list (comp.lang.python) news group/mailing list is the best place.
See http://www.python.org/community/lists/ for other lists/news groups/fora.
Thank you for understanding.

On Mon, Jun 28, 2010 at 11:09:35AM -0700, Zohair wrote:
> 
> I am a very new to python and have a small question..
> 
> I have a function:
> set_time_at_next_pps(self, *args, **kwargs) but don't know how to use it...
> Askign for your help please.
> 
> Cheers,
> 
> Zoh
> -- 
> View this message in context: http://old.nabble.com/Access-a-function-tp29008798p29008798.html
> Sent from the Python - python-dev mailing list archive at Nabble.com.
> 

Oleg.
-- 
     Oleg Broytman            http://phd.pp.ru/            phd at phd.pp.ru
           Programmers don't die, they just GOSUB without RETURN.

From alexander.belopolsky at gmail.com  Mon Jun 28 21:59:00 2010
From: alexander.belopolsky at gmail.com (Alexander Belopolsky)
Date: Mon, 28 Jun 2010 15:59:00 -0400
Subject: [Python-Dev] How to spell PyInstance_NewRaw in py3k?
Message-ID: <AANLkTimElR1bkDMLIB2h7NlVp3okjmTkjjoucHVg8NOW@mail.gmail.com>

Issue #5180 [1] presented an interesting challenge: how to unpickle
instances of old-style classes when a pickle created with 2.x is
loaded in 3.x Python?  The problem is that the pickle protocol requires
that unpickled instances be created without calling the __init__
method.  This is necessary because a pickle file may not contain
information about how the __init__ method should be invoked.  Instead,
implementations are required to bypass __init__ and populate the
instance's __dict__ directly using data found in the pickle.

Pure python implementation uses the following trick that happens to work in 3.x:

class Empty:
    pass

pickled = Empty()
pickled.__class__ = Pickled

This, of course, creates an instance of a new-style class in 3.x, but if
the 3.x version of Pickled behaves similarly to its 2.x predecessor, it
should work.
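
A self-contained sketch of the same trick, for illustration only: the
Pickled class and the state dict below are hypothetical stand-ins for a
user class and for the attribute data stored in a pickle.

```python
class Pickled:
    """Hypothetical class whose __init__ cannot be called without arguments."""

    def __init__(self, required):
        self.required = required
        self.init_called = True


class Empty:
    pass


state = {"required": 42}      # stands in for the __dict__ stored in the pickle

obj = Empty()                 # no constructor arguments needed
obj.__class__ = Pickled       # rebrand the instance; Pickled.__init__ is not run
obj.__dict__.update(state)    # populate attributes directly from the pickle data
```

The resulting object is an instance of Pickled with the pickled attributes
set, and nothing that only __init__ would have set.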

The cPickle implementation, on the other hand, uses a 2.x C API that is
not available in 3.x, namely the PyInstance_NewRaw function.  In
order to fix the bug described in issue #5180, I had to emulate
PyInstance_NewRaw using type->tp_alloc.  I considered and rejected the
idea of using tp_new instead. [2]

Is this the right way to proceed?  The patch is attached to the issue. [3]


[1] http://bugs.python.org/issue5180
[2] http://bugs.python.org/issue5180#msg108846
[3] http://bugs.python.org/file17792/issue5180.diff

From lvh at laurensvh.be  Mon Jun 28 23:33:05 2010
From: lvh at laurensvh.be (Laurens Van Houtven)
Date: Mon, 28 Jun 2010 23:33:05 +0200
Subject: [Python-Dev] Access a function
In-Reply-To: <20100628184228.GA17475@phd.pp.ru>
References: <29008798.post@talk.nabble.com> <20100628184228.GA17475@phd.pp.ru>
Message-ID: <AANLkTikfMcNpu1WgLJojpF9Ty8qWT4-E-mdJfIWJeZdq@mail.gmail.com>

Of course I concur with the two posters above me, but in order to
advertise for my own shop... If you're stuck with a lot of newbie
questions like these you might want to try #python (the IRC channel on
irc.freenode.net). You're more likely to get quick successive
responses there than on other media (which are more suitable for
bigger, more complex questions).


cheers
Laurens

From guido at python.org  Tue Jun 29 01:09:55 2010
From: guido at python.org (Guido van Rossum)
Date: Mon, 28 Jun 2010 16:09:55 -0700
Subject: [Python-Dev] [ANN]: "newthreading" - an approach to simplified
	thread usage, and a path to getting rid of the GIL
In-Reply-To: <4C262D37.7020807@animats.com>
References: <4C259A25.1060705@animats.com> <4C2600B4.5020503@voidspace.org.uk>
	<AANLkTikpDB4FsFCESF2Ub0ZhXJiZyERZF6zLAjUovcr8@mail.gmail.com> 
	<4C262D37.7020807@animats.com>
Message-ID: <AANLkTinj9h4W_1kj8_7YyOO2aRlPJi7eljKzf1Idyhx7@mail.gmail.com>

I'm moving this thread to python-ideas, where it belongs.

I've looked at the implementation code (even stepped through it with
pdb!), read the sample/test code, and read the two papers on
animats.com fairly closely (they have a lot of overlap, and the memory
model described below seems copied verbatim from
http://www.animats.com/papers/languages/pythonconcurrency.html version
0.8).

Some reactions (trying to hide my responses to the details of the code):

- First of all, I'm very happy to see radical ideas proposed, even if
they are at present unrealistic. We need a big brainstorm to come up
with ideas from which an eventual solution to the multicore problem
might be chosen. (Jesse Noller's multiprocessing is another; Adam
Olsen's work yet another, at a different end of the spectrum.)

- The proposed new semantics (frozen objects, memory model,
auto-freezing of globals, enforcement of naming conventions) are
radically different from Python's current semantics. They will break
every 3rd party library in many more ways than Python 3. This is not
surprising given the goals of the proposal (and its roots in Adam
Olsen's work) but places a huge roadblock for acceptance. I see no
choice but to keep trying to come up with a compromise that is more
palatable and compatible without throwing away all the advantages. As
it now stands, the proposal might as well be a new and different
language.

- SynchronizedObject looks like a mixture of a Java synchronized class
(a non-standard concept in Java but easily understood as a class
whose public methods are all synchronized) and a condition variable (which
has the same semantics of releasing the lock while waiting but without
crawling the stack for other locks to release). It looks like the
examples showing off SynchronizedObject could be implemented just as
elegantly using a condition variable (and voluntary abstention from
using shared mutable objects).

- If the goal is to experiment with new control structures, I
recommend decoupling them from the memory model and frozen objects,
instead relying (as is traditional in Python) on programmer caution to
avoid races. This would make it much easier to see how programmers
respond to the new control structures.

- You could add the freeze() function for voluntary use, and you could
even add automatic wrapping of arguments and return values for certain
classes using a class decorator or a metaclass, but the performance
overhead makes this unlikely to win over many converts. I don't see
much use for the "whole program freezing" done by the current
prototype -- there are way too many backdoors in Python for the
prototype approach to be anywhere near foolproof, and if we want a
non-foolproof approach, voluntary constraint (and, in some cases,
voluntary, i.e. explicit, wrapping of modules or classes) would work
just as well.

- For a larger-scale experiment with the new memory model and semantic
restrictions (or would it be better to call them syntactic
restrictions? -- after all they are about statically detectable
properties like naming conventions) I recommend looking at PyPy, which
has as one of its explicitly stated project goals easy experimentation
with different object models.

- I'm sure I've forgotten something, but I wanted to keep my impressions fresh.

- Again, John, thanks for taking the time to come up with an
implementation of your idea!
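
To make the condition-variable remark above concrete, here is a rough
sketch of a bounded buffer built on threading.Condition.  BoundedBuffer
and run_demo() are names invented for this illustration; this is not
code from the newthreading prototype or its examples.

```python
import threading
from collections import deque


class BoundedBuffer:
    """A small blocking queue protected by a single condition variable."""

    def __init__(self, capacity):
        self._items = deque()
        self._capacity = capacity
        self._cond = threading.Condition()

    def put(self, item):
        with self._cond:
            # wait() releases the lock while blocked and re-acquires it after.
            while len(self._items) >= self._capacity:
                self._cond.wait()
            self._items.append(item)
            self._cond.notify_all()

    def get(self):
        with self._cond:
            while not self._items:
                self._cond.wait()
            item = self._items.popleft()
            self._cond.notify_all()
            return item


def run_demo(count=5):
    """Move `count` integers from a producer thread to the caller."""
    buf = BoundedBuffer(capacity=2)
    producer = threading.Thread(target=lambda: [buf.put(i) for i in range(count)])
    producer.start()
    received = [buf.get() for _ in range(count)]
    producer.join()
    return received


print(run_demo())
```

Only the buffer is shared and mutable here; the items passed through
it are plain integers, which is the "voluntary abstention" part.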

--Guido

On Sat, Jun 26, 2010 at 9:39 AM, John Nagle <nagle at animats.com> wrote:
> On 6/26/2010 7:44 AM, Jesse Noller wrote:
>>
>> On Sat, Jun 26, 2010 at 9:29 AM, Michael Foord
>> <fuzzyman at voidspace.org.uk> wrote:
>>>
>>> On 26/06/2010 07:11, John Nagle wrote:
>>>>
>>>> We have just released a proof-of-concept implementation of a new
>>>> approach to thread management - "newthreading".
>
> ....
>
>>> The import * form is considered bad practise in *general* and
>>> should not be recommended unless there is a good reason.
>
>   I agree.  I just did that to make the examples cleaner.
>
>>> however the introduction of free-threading in Python has not been
>>> hampered by lack of synchronization primitives but by the
>>> difficulty of changing the interpreter without unduly impacting
>>> single threaded code.
>
>    That's what I'm trying to address here.
>
>>> Providing an alternative garbage collection mechanism other than
>>> reference counting would be a more interesting first-step as far as
>>> I can see, as that removes the locking required around every access
>>> to an object (which currently touches the reference count).
>>> Introducing free-threading by *changing* the threading semantics
>>> (so you can't share non-frozen objects between threads) would not
>>> be acceptable. That comment is likely to be based on a
>>> misunderstanding of your future intentions though. :-)
>
>    This work comes out of a discussion a few of us had at a restaurant
> in Palo Alto after a Stanford talk by the group at Facebook which
> is building a JIT compiler for PHP.  We were discussing how to
> make threading both safe for the average programmer and efficient.
> Javascript and PHP don't have threads at all; Python has safe
> threading, but it's slow.  C/C++/Java all have race condition
> problems, of course.  The Facebook guy pointed out that you
> can't redefine a function dynamically in PHP, and they get
> a performance win in their JIT by exploiting this.
>
>    I haven't gone into the memory model in enough detail in the
> technical paper.  The memory model I envision for this has three
> memory zones:
>
>    1.  Shared fully-immutable objects: primarily strings, numbers,
> and tuples, all of whose elements are fully immutable.  These can
> be shared without locking, and reclaimed by a concurrent garbage
> collector like Boehm's.  They have no destructors, so finalization
> is not an issue.
>
>    2.  Local objects.  These are managed as at present, and
> require no locking.  These can either be thread-local, or local
> to a synchronized object.  There are no links between local
> objects under different "ownership".  Whether each thread and
> object has its own private heap, or whether there's a common heap with
> locks at the allocator is an implementation decision.
>
>    3.  Shared mutable objects: mostly synchronized objects, but
> also immutable objects like tuples which contain references
> to objects that aren't fully immutable.  These are the high-overhead
> objects, and require locking during reference count updates, or
> atomic reference count operations if supported by the hardware.
> The general idea is to minimize the number of objects in this
> zone.
>
>    The zone of an object is determined when the object is created,
> and never changes.  This is relatively simple to implement.
> Tuples (and frozensets, frozendicts, etc.) are normally zone 2
> objects.  Only "freeze" creates collections in zones 1 and 3.
> Synchronized objects are always created in zone 3.
> There are no difficult handoffs, where an object that was previously
> thread-local now has to be shared and has to acquire locks during
> the transition.
>
>    Existing interlinked data structures, like parse trees and GUIs,
> are by default zone 2 objects, with the same semantics as at
> present.  They can be placed inside a SynchronizedObject if
> desired, which makes them usable from multiple threads.
> That's optional; they're thread-local otherwise.
>
>    The rationale behind "freezing" some of the language semantics
> when the program goes multi-thread comes from two sources -
> Adam Olsen's Safethread work, and the acceptance of the
> multiprocessing module.  Olsen tried to retain all the dynamism of
> the language in a multithreaded environment, but locking all the
> underlying dictionaries was a boat-anchor on the whole system,
> and slowed things down so much that he abandoned the project.
> The Unladen Swallow documentation indicates that early thinking
> on the project was that Olsen's approach would allow getting
> rid of the GIL, but later notes indicate that no path to a
> GIL-free JIT system is currently in development.
>
>    The multiprocessing module provides semantics similar to
> threading with "freezing".  Data passed between processes is "frozen"
> by pickling.  Processes can't modify each other's code.  Restrictive
> though the multiprocessing module is, it appears to be useful.
> It is sometimes recommended as the Pythonic approach to multi-core CPUs.
> This is an indication that "freezing" is not unacceptable to the
> user community.
>
>    Most of the real-world use cases for extreme dynamism
> involve events that happen during startup.  Configuration files are
> read, modules are selectively included, functions are overridden, tables
> of references to functions are set up, regular expressions are compiled,
> and the code is brought into the appropriately configured state.  Then
> the worker threads are started and the real work starts. The
> "newthreading" approach allows all that.
>
>    After two decades of failed attempts to remove the Global
> Interpreter Lock without making performance worse, it is perhaps
> time to take a harder look at scalable threading semantics.
>
>                                        John Nagle
>                                        Animats
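
As a rough illustration of the "fully immutable" test that zone 1 in
the quoted text depends on, something like the following could be used.
The helper name and the list of scalar types are assumptions made for
this sketch; it is not the check used by the newthreading prototype.

```python
# Types treated as immutable leaves; an assumption for this sketch.
IMMUTABLE_SCALARS = (str, bytes, int, float, complex, bool, type(None))


def is_fully_immutable(obj):
    """Return True if obj and everything it contains is immutable.

    Exact type checks are used on purpose: a subclass of str or tuple
    may carry mutable attributes, so it is not accepted here.
    """
    if type(obj) in IMMUTABLE_SCALARS:
        return True
    if type(obj) in (tuple, frozenset):
        return all(is_fully_immutable(item) for item in obj)
    return False


print(is_fully_immutable((1, "a", (2.0, None))))   # nested tuple of scalars
print(is_fully_immutable((1, [2, 3])))             # tuple holding a list
```

A tuple that fails this test is the kind of object the quoted text
places in zone 3 once it is shared.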

-- 
--Guido van Rossum (python.org/~guido)

From steve at holdenweb.com  Tue Jun 29 15:56:11 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 09:56:11 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
Message-ID: <i0cu24$u02$1@dough.gmane.org>

I hope this is an appropriate dev topic.

It seems to me that the unicode discussions of recent days are well
highlighted by difficulties I am having using the mailbox module (hardly
surprising given the difficulties of handling email generally) even
though it passes its tests.

I can't find anything related in the issue tracker (symptoms: one
program that works fine under Python 2 in under twenty seconds takes
forever (over ten minutes) to fail while creating the (start, stop)
index to the mailbox). My code reads Thunderbird mailboxen from file
store on my Windows Vista system under 3.1.

The failures I am experiencing could easily be encoding issues so I
won't post any detail yet, but I am concerned about the timing - even
when the code is "fixed", if it needs to be, the performance may still
make the module of dubious value.

Can someone who is set up to do so easily just do a timing of test_mailbox
under 2.6 and 3.2, to verify they see the same disparity as me? The test
takes about twice as long under 3.1 here (and I am concerned that
unexercised aspects of the code may extend real-world problem run times
by an order of magnitude or more).

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From miki.tebeka at gmail.com  Tue Jun 29 16:10:20 2010
From: miki.tebeka at gmail.com (Miki Tebeka)
Date: Tue, 29 Jun 2010 07:10:20 -0700
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <i0cu24$u02$1@dough.gmane.org>
References: <i0cu24$u02$1@dough.gmane.org>
Message-ID: <AANLkTilLj5EDH1N9CJagz7_hYAtTT5esUh2hfpoy6UCt@mail.gmail.com>

Hello Steve,

> Can someone who is set up to do so easily just do a timing of test_mailbox
> under 2.6 and 3.2, to verify they see the same disparity as me? The test
> takes about twice as long under 3.1 here
On Ubuntu timing was:

Python 2.6.5:  23.8sec
Python 2.7rc2: 32.7sec
Python 3.1.2:  32.3sec

All the best,
--
Miki

From orsenthil at gmail.com  Tue Jun 29 16:11:20 2010
From: orsenthil at gmail.com (Senthil Kumaran)
Date: Tue, 29 Jun 2010 19:41:20 +0530
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <i0cu24$u02$1@dough.gmane.org>
References: <i0cu24$u02$1@dough.gmane.org>
Message-ID: <20100629141120.GA7448@remy>

On Tue, Jun 29, 2010 at 09:56:11AM -0400, Steve Holden wrote:
> Can someone who is set up to do so easily just do a timing of test_mailbox
> under 2.6 and 3.2, to verify they see the same disparity as me? The test

Actually, No.

Python 2.7b2+ (trunk:81685M, Jun  4 2010, 21:52:06) 
Ran 274 tests in 27.231s

OK

real    0m27.769s
user    0m1.110s
sys     0m0.440s

Python 3.2a0 (py3k:82364M, Jun 29 2010, 19:37:27)

Ran 268 tests in 24.444s

OK

real    0m25.126s
user    0m2.810s
sys     0m0.270s
07:39 PM:senthil@:~/python/py3k

This is under Ubuntu 64 Bit.
Perhaps, the problem you are observing is Windows Only?

-- 
Senthil

Banectomy, n.:
	The removal of bruises on a banana.
		-- Rich Hall, "Sniglets"

From ncoghlan at gmail.com  Tue Jun 29 16:14:31 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Wed, 30 Jun 2010 00:14:31 +1000
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <i0cu24$u02$1@dough.gmane.org>
References: <i0cu24$u02$1@dough.gmane.org>
Message-ID: <AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>

Command line: ./python -m test.regrtest -v test_mailbox

trunk: Ran 274 tests in 25.239s
py3k: Ran 268 tests in 26.263s

So I don't see any substantial difference on a Kubuntu 10.04 box (both
builds are recent'ish, but not completely up to date).

However, the underlying IO access is significantly different between
POSIX and Windows, so there could still be something pathological
happening at the filesystem manipulation layer. My comparisons are
also 2.7 vs 3.2 rather than 2.6 vs 3.1.

Cheers,
Nick.

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From steve at holdenweb.com  Tue Jun 29 16:26:28 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 10:26:28 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
Message-ID: <4C2A0294.3070806@holdenweb.com>

Nick Coghlan wrote:
> Command line: ./python -m test.regrtest -v test_mailbox
> 
> trunk: Ran 274 tests in 25.239s
> py3k: Ran 268 tests in 26.263s
> 
> So I don't see any substantial difference on a Kubuntu 10.04 box (both
> builds are recent'ish, but not completely up to date).
> 
> However, the underlying IO access is significantly different between
> POSIX and Windows, so there could still be something pathological
> happening at the filesystem manipulation layer. My comparisons are
> also 2.7 vs 3.2 rather than 2.6 vs 3.1.
> 
> Cheers,
> Nick.
> 
Thanks for all the timings! If a Windows user could do the same thing
that would help ...

regards
 Steve

From steve at holdenweb.com  Tue Jun 29 16:49:00 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 10:49:00 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <4C2A0294.3070806@holdenweb.com>
References: <i0cu24$u02$1@dough.gmane.org>	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com>
Message-ID: <i0d154$b1t$1@dough.gmane.org>

Steve Holden wrote:
> Nick Coghlan wrote:
>> Command line: ./python -m test.regrtest -v test_mailbox
>>
>> trunk: Ran 274 tests in 25.239s
>> py3k: Ran 268 tests in 26.263s
>>
>> So I don't see any substantial difference on a Kubuntu 10.04 box (both
>> builds are recent'ish, but not completely up to date).
>>
>> However, the underlying IO access is significantly different between
>> POSIX and Windows, so there could still be something pathological
>> happening at the filesystem manipulation layer. My comparisons are
>> also 2.7 vs 3.2 rather than 2.6 vs 3.1.
>>
>> Cheers,
>> Nick.
>>
> Thanks for all the timings! If a Windows user could do the same thing
> that would help ...
> 
And there is *definitely* a performance issue. I created a Thunderbird
folder of 26 Google alerts and just parsed them all after reading them
in from the mailbox.

2.5 (!):  0.78 sec
3.1    : 42.80 sec

Rather than debate the code here perhaps I should just open an issue for
this? I can then provide both a program and some data, which can be
added to the tests if appropriate. The issue can clearly stand some
investigation.
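
A measurement of this kind can be reproduced with a short script along
the following lines.  This is only a sketch: it is not the test3.py
program mentioned in this thread, the sample messages are made up, and
time.perf_counter() assumes a Python 3 release newer than the 3.1 under
discussion.

```python
import mailbox
import os
import tempfile
import time


def build_sample_mbox(path, count):
    """Write `count` tiny made-up messages to a new mbox file at `path`."""
    box = mailbox.mbox(path)
    try:
        for i in range(count):
            box.add("From: alerts@example.com\nSubject: alert %d\n\nbody %d\n" % (i, i))
        box.flush()
    finally:
        box.close()


def time_mbox_parse(path):
    """Return (message_count, elapsed_seconds) for parsing every message."""
    start = time.perf_counter()
    box = mailbox.mbox(path, create=False)
    try:
        parsed = [box.get_message(key) for key in box.iterkeys()]
    finally:
        box.close()
    return len(parsed), time.perf_counter() - start


with tempfile.TemporaryDirectory() as tmp:
    sample = os.path.join(tmp, "sample.mbox")
    build_sample_mbox(sample, 26)
    count, elapsed = time_mbox_parse(sample)
    print("%d messages parsed in %.3f sec" % (count, elapsed))
```

Pointing time_mbox_parse() at a real Thunderbird mailbox file instead
of the generated sample would be closer to the case described above.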

regards
 Steve


From barry at python.org  Tue Jun 29 16:50:12 2010
From: barry at python.org (Barry Warsaw)
Date: Tue, 29 Jun 2010 10:50:12 -0400
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <4C28BF83.9080903@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
	<4C2889B7.2060105@egenix.com>
	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
	<4C28ABD4.1030000@egenix.com>
	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
	<4C28BF83.9080903@egenix.com>
Message-ID: <20100629105012.341adc7b@heresy>

On Jun 28, 2010, at 05:28 PM, M.-A. Lemburg wrote:

>How many Python users will compile Python in debug mode ?

How many Python users compile Python at all? :)

>The point is that the default build of Python should use
>the correct production settings for the C compiler out of
>the box and that's what AC_PROG_CC is all about.

Sure.

>I'm pretty sure that Python developers who want to use a
>debug build have enough code foo to get the -O2 turned into a -O0
>either by adjusting OPT and/or by providing their own CFLAGS env var.

Yes, but it's a PITA for several reasons, IMO:

* It's pretty underdocumented
* It's obscure
* It's hard to remember the exact fu needed because you do it infrequently
* I usually only remember my mistake when gdb acts funny

I strongly suggest that --with-pydebug should be all you need to ensure the
best debugging environment, which means turning off compiler optimization.
Last time I tried, the -O0 was added and it worked well.  (I know this has
been in flux though.)
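
One way to see what a particular build actually ended up with is to ask
the interpreter for its recorded build variables.  The sketch below
uses sysconfig.get_config_var(); the helper names are made up for this
example, and on some platforms (Windows in particular) the values may
simply be None.

```python
import sysconfig


def build_flags(names=("OPT", "CFLAGS", "Py_DEBUG")):
    """Return the recorded build-time values for the given variable names."""
    return {name: sysconfig.get_config_var(name) for name in names}


def looks_unoptimized(opt):
    """True if the last -O flag in `opt` is -O0.

    With GCC the last -O option on the command line wins, so only the
    final one is inspected.
    """
    levels = [flag for flag in (opt or "").split() if flag.startswith("-O")]
    return bool(levels) and levels[-1] == "-O0"


flags = build_flags()
for name, value in sorted(flags.items()):
    print("%s = %r" % (name, value))
print("OPT ends with -O0:", looks_unoptimized(flags["OPT"]))
```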

>Also note that in some cases you may actually want to have
>a debug build with optimizations turned on, e.g. to track down
>a compiler optimization bug.

Yes, but that's *much* more rare than wanting to step through some bit of C
code without going crazy.

-Barry

From mail at timgolden.me.uk  Tue Jun 29 16:51:00 2010
From: mail at timgolden.me.uk (Tim Golden)
Date: Tue, 29 Jun 2010 15:51:00 +0100
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <4C2A0294.3070806@holdenweb.com>
References: <i0cu24$u02$1@dough.gmane.org>	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com>
Message-ID: <4C2A0854.5060004@timgolden.me.uk>

On 29/06/2010 15:26, Steve Holden wrote:
> Nick Coghlan wrote:
>> Command line: ./python -m test.regrtest -v test_mailbox
>>
>> trunk: Ran 274 tests in 25.239s
>> py3k: Ran 268 tests in 26.263s
>>
>> So I don't see any substantial difference on a Kubuntu 10.04 box (both
>> builds are recent'ish, but not completely up to date).
>>
>> However, the underlying IO access is significantly different between
>> POSIX and Windows, so there could still be something pathological
>> happening at the filesystem manipulation layer. My comparisons are
>> also 2.7 vs 3.2 rather than 2.6 vs 3.1.
>>
>> Cheers,
>> Nick.
>>
> Thanks for all the timings! If a Windows user could do the same thing
> that would help ...

WinXP SP3

2.6 Ran 272 tests in 13.172s
3.1 Ran 267 tests in 15.735s
py3k A *lot* of ERROR and FAIL tests

WinXP SP3

TJG

From barry at python.org  Tue Jun 29 16:51:35 2010
From: barry at python.org (Barry Warsaw)
Date: Tue, 29 Jun 2010 10:51:35 -0400
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <4C28C7CB.8030600@egenix.com>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
	<4C2889B7.2060105@egenix.com>
	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
	<4C28ABD4.1030000@egenix.com>
	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
	<4C28BF83.9080903@egenix.com>
	<AANLkTil8Dg3RkDXq1VmPy5kTDXSMVqR3WQ4iy2A7F80L@mail.gmail.com>
	<4C28C7CB.8030600@egenix.com>
Message-ID: <20100629105135.245bf5d7@heresy>

On Jun 28, 2010, at 06:03 PM, M.-A. Lemburg wrote:

>OPT already uses -O0 if --with-pydebug is used and the
>compiler supports -g. Since OPT gets added after CFLAGS, the override
>already happens...

So nobody's proposing to drop that?  Good!  Ignore my last message then. :)

-Barry

From guido at python.org  Tue Jun 29 16:56:22 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Jun 2010 07:56:22 -0700
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <i0d154$b1t$1@dough.gmane.org>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com> 
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
Message-ID: <AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>

On Tue, Jun 29, 2010 at 7:49 AM, Steve Holden <steve at holdenweb.com> wrote:
> Steve Holden wrote:
>> Nick Coghlan wrote:
>>> Command line: ./python -m test.regrtest -v test_mailbox
>>>
>>> trunk: Ran 274 tests in 25.239s
>>> py3k: Ran 268 tests in 26.263s
>>>
>>> So I don't see any substantial difference on a Kubuntu 10.04 box (both
>>> builds are recent'ish, but not completely up to date).
>>>
>>> However, the underlying IO access is significantly different between
>>> POSIX and Windows, so there could still be something pathological
>>> happening at the filesystem manipulation layer. My comparisons are
>>> also 2.7 vs 3.2 rather than 2.6 vs 3.1.
>>>
>>> Cheers,
>>> Nick.
>>>
>> Thanks for all the timings! If a Windows user could do the same thing
>> that would help ...
>>
> And there is *definitely* a performance issue. I created a Thunderbird
> folder of 26 Google alerts and just parsed them all after reading them
> in from the mailbox.
>
> 2.5 (!):  0.78 sec
> 3.1    : 42.80 sec
>
> Rather than debate the code here perhaps I should just open an issue for
> this? I can then provide both a program and some data, which can be
> added to the tests if appropriate. The issue can clearly stand some
> investigation.

Since you have such a great reproducible test case, could you point
the profiler at it? (Perhaps on a reduced dataset... The profiler
multiplies your run time by some number between 2 and 10 IIRC.)

-- 
--Guido van Rossum (python.org/~guido)

From mail at timgolden.me.uk  Tue Jun 29 17:04:48 2010
From: mail at timgolden.me.uk (Tim Golden)
Date: Tue, 29 Jun 2010 16:04:48 +0100
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <4C2A0854.5060004@timgolden.me.uk>
References: <i0cu24$u02$1@dough.gmane.org>	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>	<4C2A0294.3070806@holdenweb.com>
	<4C2A0854.5060004@timgolden.me.uk>
Message-ID: <4C2A0B90.9020705@timgolden.me.uk>

On 29/06/2010 15:51, Tim Golden wrote:
> On 29/06/2010 15:26, Steve Holden wrote:
>> Nick Coghlan wrote:
>>> Command line: ./python -m test.regrtest -v test_mailbox
>>>
>>> trunk: Ran 274 tests in 25.239s
>>> py3k: Ran 268 tests in 26.263s
>>>
>>> So I don't see any substantial difference on a Kubuntu 10.04 box (both
>>> builds are recent'ish, but not completely up to date).
>>>
>>> However, the underlying IO access is significantly different between
>>> POSIX and Windows, so there could still be something pathological
>>> happening at the filesystem manipulation layer. My comparisons are
>>> also 2.7 vs 3.2 rather than 2.6 vs 3.1.
>>>
>>> Cheers,
>>> Nick.
>>>
>> Thanks for all the timings! If a Windows user could do the same thing
>> that would help ...
>
> WinXP SP3
>
> 2.6 Ran 272 tests in 13.172s
> 3.1 Ran 267 tests in 15.735s
> py3k A *lot* of ERROR and FAIL tests

py3k HEAD on Win7 Ran 268 tests in 34.055s

TJG

From vinay_sajip at yahoo.co.uk  Tue Jun 29 17:15:22 2010
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Tue, 29 Jun 2010 15:15:22 +0000 (UTC)
Subject: [Python-Dev] Pickle security and remote logging
References: <AANLkTinvmLy1UaOODajeiD3ySn43cIQIPvnZHY4Yi_PS@mail.gmail.com>
Message-ID: <loom.20100629T171329-642@post.gmane.org>

anatoly techtonik <techtonik <at> gmail.com> writes:

> insecure. SocketHandler and DatagramHandler docs should at least
> contain a warning about danger of exposing unpickling interfaces to
> insecure networks.

I've updated the documentation of SocketHandler.makePickle to mention security
concerns, and that the method can be overridden to use a more secure
implementation (e.g. HMAC-signed pickles).
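
For anyone wondering what an HMAC-signed pickle might look like, here
is a minimal standalone sketch.  The function names and the choice of
SHA-256 are assumptions for this example; it does not show how
SocketHandler.makePickle frames its output, and a receiver would need
the matching verification step before unpickling anything.

```python
import hashlib
import hmac
import pickle

DIGEST_SIZE = hashlib.sha256().digest_size


def sign_pickle(obj, key):
    """Return the signature followed by the pickled bytes of obj."""
    payload = pickle.dumps(obj)
    digest = hmac.new(key, payload, hashlib.sha256).digest()
    return digest + payload


def load_signed_pickle(data, key):
    """Verify the signature first; unpickle only if it matches."""
    digest, payload = data[:DIGEST_SIZE], data[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("signature mismatch; refusing to unpickle")
    return pickle.loads(payload)


secret = b"shared-secret-for-demo-only"
blob = sign_pickle({"msg": "disk full", "levelno": 40}, secret)
print(load_signed_pickle(blob, secret))
```

The signature only shows that the sender knew the shared key; it does
not make unpickling data from an untrusted sender safe.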

Regards,

Vinay Sajip


From steve at holdenweb.com  Tue Jun 29 17:29:55 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 11:29:55 -0400
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <20100629105012.341adc7b@heresy>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>	<4C268F1E.5070506@egenix.com>	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>	<4C2889B7.2060105@egenix.com>	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>	<4C28ABD4.1030000@egenix.com>	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>	<4C28BF83.9080903@egenix.com>
	<20100629105012.341adc7b@heresy>
Message-ID: <i0d3ht$kco$1@dough.gmane.org>

Barry Warsaw wrote:
> On Jun 28, 2010, at 05:28 PM, M.-A. Lemburg wrote:
> 
>> How many Python users will compile Python in debug mode ?
> 
> How many Python users compile Python at all? :)
> 
>> The point is that the default build of Python should use
>> the correct production settings for the C compiler out of
>> the box and that's what AC_PROG_CC is all about.
> 
> Sure.
> 
>> I'm pretty sure that Python developers who want to use a
>> debug build have enough code foo to get the -O2 turned into a -O0
>> either by adjusting OPT and/or by providing their own CFLAGS env var.
> 
> Yes, but it's a PITA for several reasons, IMO:
> 
> * It's pretty underdocumented
> * It's obscure
> * It's hard to remember the exact fu needed because you do it infrequently
> * I usually only remember my mistake when gdb acts funny
> 
> I strongly suggest that --with-pydebug should be all you need to ensure the
> best debugging environment, which means turning off compiler optimization.
> Last time I tried, the -O0 was added and it worked well.  (I know this has
> been in flux though.)
> 
>> Also note that in some cases you may actually want to have
>> a debug build with optimizations turned on, e.g. to track down
>> a compiler optimization bug.
> 
> Yes, but that's *much* more rare than wanting to step through some bit of C
> code without going crazy.

I agree - trying to step through -O2 optimized code isn't going to help
debug your code, it's going to help you debug the optimizer. That's a
very rare use case.

regards
 Steve


From steve at holdenweb.com  Tue Jun 29 17:40:50 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 11:40:50 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
References: <i0cu24$u02$1@dough.gmane.org>	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
Message-ID: <i0d46b$n5g$1@dough.gmane.org>

Guido van Rossum wrote:
> On Tue, Jun 29, 2010 at 7:49 AM, Steve Holden <steve at holdenweb.com> wrote:
>> Steve Holden wrote:
>>> Nick Coghlan wrote:
>>>> Command line: ./python -m test.regrtest -v test_mailbox
>>>>
>>>> trunk: Ran 274 tests in 25.239s
>>>> py3k: Ran 268 tests in 26.263s
>>>>
>>>> So I don't see any substantial difference on a Kubuntu 10.04 box (both
>>>> builds are recent'ish, but not completely up to date).
>>>>
>>>> However, the underlying IO access is significantly different between
>>>> POSIX and Windows, so there could still be something pathological
>>>> happening at the filesystem manipulation layer. My comparisons are
>>>> also 2.7 vs 3.2 rather than 2.6 vs 3.1.
>>>>
>>>> Cheers,
>>>> Nick.
>>>>
>>> Thanks for all the timings! If a Windows user could do the same thing
>>> that would help ...
>>>
>> And there is *definitely* a performance issue. I created a Thunderbird
>> folder of 26 Google alerts and just parsed them all after reading them
>> in from the mailbox.
>>
>> 2.5 (!):  0.78 sec
>> 3.1    : 42.80 sec
>>
>> Rather than debate the code here perhaps I should just open an issue for
>> this? I can then provide both a program and some data, which can be
>> added to the tests if appropriate. The issue can clearly stand some
>> investigation.
> 
> Since you have such a great reproducible test case, could you point
> the profiler at it? (Perhaps on a reduced dataset... The profiler
> multiplies your run time by some number between 2 and 10 IIRC.)
> 
Sure. I attach the outputs of both files, as well as the program and the
data. With profiling (python -m cProfile test3.py) the run took less
than a third of a second under 2.5, and 168 seconds under 3.1. I'd say
that was problematical :)

I will leave the profiler output to speak for itself, since I can find
nothing much to say about it except that there's a hell of a lot of
decoding going on inside mailbox.iterkeys().
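
For anyone who wants to repeat this from inside a script rather than
with "python -m cProfile", here is a minimal sketch. The workload()
function is a made-up stand-in, not the attached test3.py; only the
cProfile and pstats calls are the point.

```python
import cProfile
import io
import pstats


def workload():
    # Hypothetical stand-in for parsing a mailbox: decode many short lines.
    data = b"From someone\n" * 1000
    return sum(len(line.decode("cp1252")) for line in data.splitlines())


profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Collect the report in memory, sorted by cumulative time, top 5 entries.
report = io.StringIO()
stats = pstats.Stats(profiler, stream=report)
stats.sort_stats("cumulative").print_stats(5)
print(report.getvalue())
```

Sorting by cumulative time tends to surface the caller that triggers
the expensive decoding, rather than only the codec function itself.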

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test3.1.out
URL: <http://mail.python.org/pipermail/python-dev/attachments/20100629/53064c86/attachment-0004.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test2.5.out
URL: <http://mail.python.org/pipermail/python-dev/attachments/20100629/53064c86/attachment-0005.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test3.py
URL: <http://mail.python.org/pipermail/python-dev/attachments/20100629/53064c86/attachment-0006.ksh>
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: test.mailbox
URL: <http://mail.python.org/pipermail/python-dev/attachments/20100629/53064c86/attachment-0007.ksh>

From solipsis at pitrou.net  Tue Jun 29 18:34:22 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 29 Jun 2010 18:34:22 +0200
Subject: [Python-Dev] Mailbox module - timings and functionality changes
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
Message-ID: <20100629183422.00f1997d@pitrou.net>

On Tue, 29 Jun 2010 11:40:50 -0400
Steve Holden <steve at holdenweb.com> wrote:
> Sure. I attach the outputs of both files, as well as the program and the
> data. With profiling (python -m cProfile test3.py) the run took less
> than a third of a second under 2.5, and 168 seconds under 3.1. I'd say
> that was problematical :)
> 
> I will leave the profiler output to speak for itself, since I can find
> nothing much to say about it except that there's a hell of a lot of
> decoding going on inside mailbox.iterkeys().

Ok, a lot of time is spent in cp1252 decoding. Somewhat less time, but
still too much of it, is spent in TextIOWrapper.tell(). This seems to
imply that mailbox files are opened in text mode, which sounds wrong to
me. Perhaps Andrew can shed more light on this?


From amk at amk.ca  Tue Jun 29 18:34:42 2010
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 29 Jun 2010 12:34:42 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
Message-ID: <20100629163442.GA5051@amk-desktop.matrixgroup.net>

On Tue, Jun 29, 2010 at 07:56:22AM -0700, Guido van Rossum wrote:
> Since you have such a great reproducible test case, could you point
> the profiler at it? (Perhaps on a reduced dataset... The profiler
> multiplies your run time by some number between 2 and 10 IIRC.)

Let me underline Guido's suggestion.  Steve, I've done a lot of
mailbox.py stuff and can look at your problem, but off the top of my
head, my suspicion would be that I/O is the culprit, and a profile
could confirm that.  My thought is that mailbox.py is opening the file
in some reading mode that ends up doing a lot more processing on
Windows than on Unix because of universal newlines or something like
that.

--amk

From amk at amk.ca  Tue Jun 29 18:52:28 2010
From: amk at amk.ca (A.M. Kuchling)
Date: Tue, 29 Jun 2010 12:52:28 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <i0d46b$n5g$1@dough.gmane.org>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
Message-ID: <20100629165228.GA5350@amk-desktop.matrixgroup.net>

On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote:
> I will leave the profiler output to speak for itself, since I can find
> nothing much to say about it except that there's a hell of a lot of
> decoding going on inside mailbox.iterkeys().

The problem is actually in _generate_toc(), which is reading through
the entire file to figure out where all the 'From' lines that start
messages are located.  TextIOWrapper()'s tell() method seems to be
very slow, so one help is to only call tell() when necessary; patch:

-> svn diff Lib/
Index: Lib/mailbox.py
===================================================================
--- Lib/mailbox.py	(revision 82346)
+++ Lib/mailbox.py	(working copy)
@@ -775,13 +775,14 @@
         starts, stops = [], []
         self._file.seek(0)
         while True:
-            line_pos = self._file.tell()
             line = self._file.readline()
             if line.startswith('From '):
+                line_pos = self._file.tell()
                 if len(stops) < len(starts):
                     stops.append(line_pos - len(os.linesep))
                 starts.append(line_pos)
             elif not line:
+                line_pos = self._file.tell()
                 stops.append(line_pos)
                 break
         self._toc = dict(enumerate(zip(starts, stops)))

But should mailboxes really be opened in a UTF-8 encoding, or should
they be treated as 7-bit text?  I'll have to think about this.

--amk

From rdmurray at bitdance.com  Tue Jun 29 19:20:35 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 29 Jun 2010 13:20:35 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <20100629183422.00f1997d@pitrou.net>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org> <20100629183422.00f1997d@pitrou.net>
Message-ID: <20100629172035.8348D21A2AF@kimball.webabinitio.net>

On Tue, 29 Jun 2010 18:34:22 +0200, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tue, 29 Jun 2010 11:40:50 -0400
> Steve Holden <steve at holdenweb.com> wrote:
> > Sure. I attach the outputs of both files, as well as the program and the
> > data. With profiling (python -m cProfile test3.py) the run took less
> > than a third of a second under 2.5, and 168 seconds under 3.1. I'd say
> > that was problematical :)
> > 
> > I will leave the profiler output to speak for itself, since I can find
> > nothing much to say about it except that there's a hell of a lot of
> > decoding going on inside mailbox.iterkeys().
> 
> Ok, a lot of time is spent in cp1252 decoding. Somewhat less time, but
> still too much of it, is spent in TextIOWrapper.tell(). This seems to
> imply that mailbox files are opened in text mode, which sounds wrong to
> me. Perhaps Andrew can shed more light on this?

Given the current state of the email package for python3, it makes
sense that it would open them in text mode.  email can't currently
process bytes, only text.

--
R. David Murray                                      www.bitdance.com

From solipsis at pitrou.net  Tue Jun 29 19:30:53 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Tue, 29 Jun 2010 19:30:53 +0200
Subject: [Python-Dev] Mailbox module - timings and functionality changes
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
Message-ID: <20100629193053.750991e1@pitrou.net>

On Tue, 29 Jun 2010 12:52:28 -0400
"A.M. Kuchling" <amk at amk.ca> wrote:
> 
> But should mailboxes really be opened in a UTF-8 encoding, or should
> they be treated as 7-bit text?  I'll have to think about this.

I don't see how you can assume UTF-8 for mailbox files, given that each
message will have its particular encoding.
Besides, Steve's profile results show that you are not using UTF-8, but
rather the local encoding, which is cp1252 under his Windows setup.

Regards

Antoine.


From steve at holdenweb.com  Tue Jun 29 19:54:09 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 13:54:09 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <20100629165228.GA5350@amk-desktop.matrixgroup.net>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
Message-ID: <4C2A3341.4010705@holdenweb.com>

A.M. Kuchling wrote:
> On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote:
>> I will leave the profiler output to speak for itself, since I can find
>> nothing much to say about it except that there's a hell of a lot of
>> decoding going on inside mailbox.iterkeys().
> 
> The problem is actually in _generate_toc(), which is reading through
> the entire file to figure out where all the 'From' lines that start
> messages are located.  TextIOWrapper()'s tell() method seems to be
> very slow, so one help is to only call tell() when necessary; patch:
> 
> -> svn diff Lib/
> Index: Lib/mailbox.py
> ===================================================================
> --- Lib/mailbox.py	(revision 82346)
> +++ Lib/mailbox.py	(working copy)
> @@ -775,13 +775,14 @@
>          starts, stops = [], []
>          self._file.seek(0)
>          while True:
> -            line_pos = self._file.tell()
>              line = self._file.readline()
>              if line.startswith('From '):
> +                line_pos = self._file.tell()
>                  if len(stops) < len(starts):
>                      stops.append(line_pos - len(os.linesep))
>                  starts.append(line_pos)
>              elif not line:
> +                line_pos = self._file.tell()
>                  stops.append(line_pos)
>                  break
>          self._toc = dict(enumerate(zip(starts, stops)))
> 
> But should mailboxes really be opened in a UTF-8 encoding, or should
> they be treated as 7-bit text?  I'll have to think about this.

Neither! You can't open them as 7-bit text, because real-world email
does contain bytes whose ordinal value exceeds 127. You can't open them
using a text encoding because theoretically there might be ASCII headers
that indicate that parts of the content are in specific character sets
or encodings.

If only we had a data structure that easily allowed us to manipulate
8-bit characters ...

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000

From guido at python.org  Tue Jun 29 21:26:31 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Jun 2010 12:26:31 -0700
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <4C2A3341.4010705@holdenweb.com>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com> 
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org> 
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com> 
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net> 
	<4C2A3341.4010705@holdenweb.com>
Message-ID: <AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>

It should probably be opened in binary mode. Binary files do have a
.readline() method (returning a bytes object), and bytes objects have
a .startswith() method. The tell positions computed this way are even
compatible with those used by the text file. So you could do it this
way:

- open binary stream
- compute TOC by reading through it using .readline() and .tell()
- rewind (don't close)
- wrap the binary stream in a text stream
- use that for the rest of the code
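
The steps above can be sketched roughly as follows. This is only an
illustration under assumptions, not the mailbox.py code: build_toc and
SAMPLE are invented names, the latin-1 encoding is an arbitrary choice,
and the stop offset is simplified to the start of the next 'From ' line.

```python
import io

# Hypothetical two-message mbox held in memory.
SAMPLE = (
    b"From a@example.com Tue Jun 29 2010\n"
    b"Subject: one\n\nbody one\n\n"
    b"From b@example.com Tue Jun 29 2010\n"
    b"Subject: two\n\nbody two\n"
)


def build_toc(binary_file):
    """Map message index to (start, stop) byte offsets."""
    starts, stops = [], []
    binary_file.seek(0)
    while True:
        line_pos = binary_file.tell()    # plain byte offset on a binary stream
        line = binary_file.readline()    # returns a bytes object
        if line.startswith(b"From "):
            if len(stops) < len(starts):
                stops.append(line_pos)   # previous message ends here
            starts.append(line_pos)
        elif not line:                   # end of file
            stops.append(line_pos)
            break
    return dict(enumerate(zip(starts, stops)))


raw = io.BytesIO(SAMPLE)                 # stands in for open(path, "rb")
toc = build_toc(raw)
raw.seek(0)                              # rewind, don't close
text = io.TextIOWrapper(raw, encoding="latin-1", newline="")
first_line = text.readline()
```

The wrapper reads from the same underlying stream, which is why the
binary stream has to stay open after the table of contents is built.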

--Guido

On Tue, Jun 29, 2010 at 10:54 AM, Steve Holden <steve at holdenweb.com> wrote:
> A.M. Kuchling wrote:
>> On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote:
>>> I will leave the profiler output to speak for itself, since I can find
>>> nothing much to say about it except that there's a hell of a lot of
>>> decoding going on inside mailbox.iterkeys().
>>
>> The problem is actually in _generate_toc(), which is reading through
>> the entire file to figure out where all the 'From' lines that start
>> messages are located.  TextIOWrapper()'s tell() method seems to be
>> very slow, so one help is to only call tell() when necessary; patch:
>>
>> -> svn diff Lib/
>> Index: Lib/mailbox.py
>> ===================================================================
>> --- Lib/mailbox.py    (revision 82346)
>> +++ Lib/mailbox.py    (working copy)
>> @@ -775,13 +775,14 @@
>>          starts, stops = [], []
>>          self._file.seek(0)
>>          while True:
>> -            line_pos = self._file.tell()
>>              line = self._file.readline()
>>              if line.startswith('From '):
>> +                line_pos = self._file.tell()
>>                  if len(stops) < len(starts):
>>                      stops.append(line_pos - len(os.linesep))
>>                  starts.append(line_pos)
>>              elif not line:
>> +                line_pos = self._file.tell()
>>                  stops.append(line_pos)
>>                  break
>>          self._toc = dict(enumerate(zip(starts, stops)))
>>
>> But should mailboxes really be opened in a UTF-8 encoding, or should
>> they be treated as 7-bit text?  I'll have to think about this.
>
> Neither! You can't open them as 7-bit text, because real-world email
> does contain bytes whose ordinal value exceeds 127. You can't open them
> using a text encoding because theoretically there might be ASCII headers
> that indicate that parts of the content are in specific character sets
> or encodings.
>
> If only we had a data structure that easily allowed us to manipulate
> 8-bit characters ...
>
> regards
>  Steve
> --
> Steve Holden           +1 571 484 6266   +1 800 494 3119
> See Python Video!       http://python.mirocommunity.org/
> Holden Web LLC                 http://www.holdenweb.com/
> UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
> "All I want for my birthday is another birthday" -
>                                      Ian Dury, 1942-2000
>


-- 
--Guido van Rossum (python.org/~guido)

From steve at holdenweb.com  Tue Jun 29 23:02:14 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 17:02:14 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>
References: <i0cu24$u02$1@dough.gmane.org>	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
	<4C2A3341.4010705@holdenweb.com>
	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>
Message-ID: <4C2A5F56.2010700@holdenweb.com>

Guido van Rossum wrote:
> It should probably be opened in binary mode. Binary files do have a
> .readline() method (returning a bytes object), and bytes objects have
> a .startswith() method. The tell positions computed this way are even
> compatible with those used by the text file. So you could do it this
> way:
> 
> - open binary stream
> - compute TOC by reading through it using .readline() and .tell()
> - rewind (don't close)

Because closing is inefficient, or because it breaks the algorithm?

> - wrap the binary stream in a text stream

"wrap" how? The ultimate destiny of the text is twofold:

1) To be stored as some kind of LOB in a database, and
2) Therefrom to be reconstituted and parsed into email.Message objects.

Is the wrapping a one-off operation or a software layer? Sorry, being a
bit dense here, I know.

regards
 Steve

> - use that for the rest of the code
> 
> --Guido
> 
> On Tue, Jun 29, 2010 at 10:54 AM, Steve Holden <steve at holdenweb.com> wrote:
>> A.M. Kuchling wrote:
>>> On Tue, Jun 29, 2010 at 11:40:50AM -0400, Steve Holden wrote:
>>>> I will leave the profiler output to speak for itself, since I can find
>>>> nothing much to say about it except that there's a hell of a lot of
>>>> decoding going on inside mailbox.iterkeys().
>>> The problem is actually in _generate_toc(), which is reading through
>>> the entire file to figure out where all the 'From' lines that start
>>> messages are located.  TextIOWrapper()'s tell() method seems to be
>>> very slow, so one help is to only call tell() when necessary; patch:
>>>
>>> -> svn diff Lib/
>>> Index: Lib/mailbox.py
>>> ===================================================================
>>> --- Lib/mailbox.py    (revision 82346)
>>> +++ Lib/mailbox.py    (working copy)
>>> @@ -775,13 +775,14 @@
>>>          starts, stops = [], []
>>>          self._file.seek(0)
>>>          while True:
>>> -            line_pos = self._file.tell()
>>>              line = self._file.readline()
>>>              if line.startswith('From '):
>>> +                line_pos = self._file.tell()
>>>                  if len(stops) < len(starts):
>>>                      stops.append(line_pos - len(os.linesep))
>>>                  starts.append(line_pos)
>>>              elif not line:
>>> +                line_pos = self._file.tell()
>>>                  stops.append(line_pos)
>>>                  break
>>>          self._toc = dict(enumerate(zip(starts, stops)))
>>>
>>> But should mailboxes really be opened in a UTF-8 encoding, or should
>>> they be treated as 7-bit text?  I'll have to think about this.
>> Neither! You can't open them as 7-bit text, because real-world email
>> does contain bytes whose ordinal value exceeds 127. You can't open them
>> using a text encoding because theoretically there might be ASCII headers
>> that indicate that parts of the content are in specific character sets
>> or encodings.
>>
>> If only we had a data structure that easily allowed us to manipulate
>> 8-bit characters ...
>>
>> regards
>>  Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From techtonik at gmail.com  Wed Jun 30 01:22:59 2010
From: techtonik at gmail.com (anatoly techtonik)
Date: Wed, 30 Jun 2010 02:22:59 +0300
Subject: [Python-Dev] Pickle security and remote logging
In-Reply-To: <loom.20100629T171329-642@post.gmane.org>
References: <AANLkTinvmLy1UaOODajeiD3ySn43cIQIPvnZHY4Yi_PS@mail.gmail.com>
	<loom.20100629T171329-642@post.gmane.org>
Message-ID: <AANLkTimqepJXp-HsaqGf6XpsZqOINVlwgMAsPzkd-lFD@mail.gmail.com>

On Tue, Jun 29, 2010 at 6:15 PM, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
>
> I've updated the documentation of SocketHandler.makePickle to mention security
> concerns, and that the method can be overridden to use a more secure
> implementation (e.g. HMAC-signed pickles).

Thanks. But I doubt the added complication of HMAC helps to protect the
logging server. If the shared key is compromised, the server becomes
vulnerable. I would prefer an approach where no code execution is
possible: some alternative serialization for transmitting log data
structures over the network. Protocol buffers come to mind first, but
they seem to be overkill, and the stdlib doesn't include any
implementation.
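
To make the trade-off concrete, here is a rough sketch of what
HMAC-signed pickles could look like. Everything named here (SHARED_KEY,
sign, verify_and_load) is hypothetical and not part of the logging
package; SHA-256 is an assumed choice of digest.

```python
import hashlib
import hmac
import pickle

SHARED_KEY = b"hypothetical-shared-secret"   # assumption: known to both ends
DIGEST_SIZE = hashlib.sha256().digest_size   # 32 bytes


def sign(payload, key=SHARED_KEY):
    """Prefix the pickled payload with its HMAC-SHA256 digest."""
    return hmac.new(key, payload, hashlib.sha256).digest() + payload


def verify_and_load(frame, key=SHARED_KEY):
    """Unpickle only if the digest matches; otherwise refuse."""
    digest, payload = frame[:DIGEST_SIZE], frame[DIGEST_SIZE:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(digest, expected):
        raise ValueError("bad signature; refusing to unpickle")
    return pickle.loads(payload)


frame = sign(pickle.dumps({"msg": "hello"}))
record = verify_and_load(frame)
```

The check only proves that the sender knew the key. Anyone holding the
key can still send a pickle that runs code on the server, which is the
weakness described above.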

-- 
anatoly t.

From guido at python.org  Wed Jun 30 01:41:52 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Jun 2010 16:41:52 -0700
Subject: [Python-Dev] Pickle security and remote logging
In-Reply-To: <AANLkTimqepJXp-HsaqGf6XpsZqOINVlwgMAsPzkd-lFD@mail.gmail.com>
References: <AANLkTinvmLy1UaOODajeiD3ySn43cIQIPvnZHY4Yi_PS@mail.gmail.com> 
	<loom.20100629T171329-642@post.gmane.org>
	<AANLkTimqepJXp-HsaqGf6XpsZqOINVlwgMAsPzkd-lFD@mail.gmail.com>
Message-ID: <AANLkTikMY433SWKU9f9gMRj4Gl9i810G-DIzNN3zS9sb@mail.gmail.com>

On Tue, Jun 29, 2010 at 4:22 PM, anatoly techtonik <techtonik at gmail.com> wrote:
> On Tue, Jun 29, 2010 at 6:15 PM, Vinay Sajip <vinay_sajip at yahoo.co.uk> wrote:
>>
>> I've updated the documentation of SocketHandler.makePickle to mention security
>> concerns, and that the method can be overridden to use a more secure
>> implementation (e.g. HMAC-signed pickles).
>
> Thanks. But I doubt the added complication of HMAC helps to protect the
> logging server. If the shared key is compromised, the server becomes
> vulnerable. I would prefer an approach where no code execution is
> possible: some alternative serialization for transmitting log data
> structures over the network. Protocol buffers come to mind first, but
> they seem to be overkill, and the stdlib doesn't include any
> implementation.

You could use marshal by default. It does not execute code when
unmarshalling. A limitation is that it only supports built-in types
like list, dict, string etc. but that might be just fine for logging
data. Another option would be JSON. (Or XML, if you want bulky. :-)
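
A minimal sketch of the JSON option, assuming one simply overrides
SocketHandler.makePickle. JSONSocketHandler is an invented name, and
the receiving end would need a matching change to decode JSON instead
of unpickling.

```python
import json
import logging
import logging.handlers
import struct


class JSONSocketHandler(logging.handlers.SocketHandler):
    """Hypothetical handler: sends length-prefixed JSON, not a pickle."""

    def makePickle(self, record):
        payload = dict(record.__dict__)
        payload["msg"] = record.getMessage()   # merge args into the message
        payload["args"] = None
        payload["exc_info"] = None             # tracebacks are not JSON-friendly
        # default=str is a blunt fallback for any other non-JSON value.
        data = json.dumps(payload, default=str).encode("utf-8")
        # Same framing idea as the stock handler: 4-byte big-endian length.
        return struct.pack(">L", len(data)) + data


handler = JSONSocketHandler("localhost", 9020)   # no connection is made yet
record = logging.makeLogRecord({"msg": "disk %s full", "args": ("sda1",)})
frame = handler.makePickle(record)
(length,) = struct.unpack(">L", frame[:4])
decoded = json.loads(frame[4:].decode("utf-8"))
```

Decoding JSON yields only dicts, lists, strings, numbers, booleans and
None, so nothing is executed on the receiving side.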

As for protocol buffers, assuming its absence (so far :-) from the
stdlib is the only objection, how hard would it be to make the logging
package "prepared" so that if one *did* have protocol buffers
installed, it would be a one-line config setting to use them?

-- 
--Guido van Rossum (python.org/~guido)

From rdmurray at bitdance.com  Wed Jun 30 01:56:30 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 29 Jun 2010 19:56:30 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <4C2A3341.4010705@holdenweb.com>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
	<4C2A3341.4010705@holdenweb.com>
Message-ID: <20100629235630.E02B61FDDBE@kimball.webabinitio.net>

On Tue, 29 Jun 2010 13:54:09 -0400, Steve Holden <steve at holdenweb.com> wrote:
> A.M. Kuchling wrote:
> > But should mailboxes really be opened in a UTF-8 encoding, or should
> > they be treated as 7-bit text?  I'll have to think about this.
> 
> Neither! You can't open them as 7-bit text, because real-world email
> does contain bytes whose ordinal value exceeds 127. You can't open them
> using a text encoding because theoretically there might be ASCII headers
> that indicate that parts of the content are in specific character sets
> or encodings.
> 
> If only we had a data structure that easily allowed us to manipulate
> 8-bit characters ...

email6 *will* handle this use case.  When it exists :)  But note that it
is *not* just a matter of easily handling 8 bit characters.  There are
a whole bunch of algorithms needed for interpreting that 7 and 8 bit data.
All the info is there in the email headers, but being able to do string
operations on 8 bit byte strings doesn't get you the answers you need
by itself.

It really is the case that the Python3 bytes/unicode split forces us
to redo most of the algorithms so that they handle bytes and text
*correctly*.  This isn't a trivial undertaking, but the end result
will be well worth it.

--
R. David Murray                                      www.bitdance.com

From rdmurray at bitdance.com  Wed Jun 30 02:05:29 2010
From: rdmurray at bitdance.com (R. David Murray)
Date: Tue, 29 Jun 2010 20:05:29 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <4C2A5F56.2010700@holdenweb.com>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
	<4C2A3341.4010705@holdenweb.com>
	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>
	<4C2A5F56.2010700@holdenweb.com>
Message-ID: <20100630000529.3AA351FF08C@kimball.webabinitio.net>

On Tue, 29 Jun 2010 17:02:14 -0400, Steve Holden <steve at holdenweb.com> wrote:
> Guido van Rossum wrote:
> 
> > - wrap the binary stream in a text stream
> 
> "wrap" how? The ultimate destiny of the text is twofold:

I would imagine Guido is talking about an io.TextIOWrapper...in other
words, take the binary file you've just finished grabbing info
from, and reread it as a text file in order to grab the actual
message content.

If you have messages in your files that are using an 8bit content
transfer encoding, then you (currently) will have some problems
unless the charset happens to be the one you use when you wrap
the binary stream as a text stream.
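
A small illustration of that mismatch, using an invented one-line
message rather than a real mailbox: the same bytes read through
io.TextIOWrapper with two different encodings.

```python
import io

# Hypothetical header line containing a non-ASCII character, stored as UTF-8.
raw_bytes = "Subject: caf\u00e9\n".encode("utf-8")

# Wrapped with the charset the bytes were written in.
as_utf8 = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="utf-8").read()

# Wrapped with a different charset (cp1252, as on the Windows setup above).
as_cp1252 = io.TextIOWrapper(io.BytesIO(raw_bytes), encoding="cp1252").read()
```

Here as_utf8 holds the intended text, while as_cp1252 decodes without
error but turns the two UTF-8 bytes of the accented letter into two
unrelated characters.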

--
R. David Murray                                      www.bitdance.com

From steve at holdenweb.com  Wed Jun 30 02:31:59 2010
From: steve at holdenweb.com (Steve Holden)
Date: Tue, 29 Jun 2010 20:31:59 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <20100629235630.E02B61FDDBE@kimball.webabinitio.net>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
	<4C2A3341.4010705@holdenweb.com>
	<20100629235630.E02B61FDDBE@kimball.webabinitio.net>
Message-ID: <4C2A907F.1010409@holdenweb.com>

R. David Murray wrote:
> On Tue, 29 Jun 2010 13:54:09 -0400, Steve Holden <steve at holdenweb.com> wrote:
>> A.M. Kuchling wrote:
>>> But should mailboxes really be opened in a UTF-8 encoding, or should
>>> they be treated as 7-bit text?  I'll have to think about this.
>> Neither! You can't open them as 7-bit text, because real-world email
>> does contain bytes whose ordinal value exceeds 127. You can't open them
>> using a text encoding because theoretically there might be ASCII headers
>> that indicate that parts of the content are in specific character sets
>> or encodings.
>>
>> If only we had a data structure that easily allowed us to manipulate
>> 8-bit characters ...
> 
> email6 *will* handle this use case.  When it exists :)  But note that it
> is *not* just a matter of easily handling 8 bit characters.  There are
> a whole bunch of algorithms needed for interpreting that 7 and 8 bit data.
> All the info is there in the email headers, but being able to do string
> operations on 8 bit byte strings doesn't get you the answers you need
> by itself.
> 
> It really is the case that the Python3 bytes/unicode split forces us
> to redo most of the algorithms so that they handle bytes and text
> *correctly*.  This isn't a trivial undertaking, but the end result
> will be well worth it.
> 
I completely agree. The unusual thing here is that I of all people
should find myself running into these issues, since my use of Python is
normally pretty conservative. Since the course I am currently writing is
already overdue I have to find answers now to problems that were present
in the initial 3.0 release and have not received much attention since.

You know that I support your work to revise the email package. I hope
that we can eventually have it incorporate mailbox readers as well.

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/
UPCOMING EVENTS:        http://holdenweb.eventbrite.com/
"All I want for my birthday is another birthday" -
                                     Ian Dury, 1942-2000


From janssen at parc.com  Wed Jun 30 04:55:12 2010
From: janssen at parc.com (Bill Janssen)
Date: Tue, 29 Jun 2010 19:55:12 PDT
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
Message-ID: <71728.1277866512@parc.com>

My Leopard and Tiger PPC buildbots are momentarily green!  But I'm
looking into why I'm skipping some tests.  My buildbots are up-to-date
OS-wise and very vanilla, with the latest applicable Xcode.

4 skips unexpected on darwin:
    test_gdb test_ioctl test_readline test_ttk_guionly

Three of these (gdb, readline, ttk_guionly) are just bad predictions of
which tests should skip on Darwin, I think -- gdb is only version 6, so
that test won't run, readline doesn't get built, ttk doesn't work
without Tcl/Tk 8.5.  But the skip of test_ioctl baffles me.

"test_ioctl skipped -- Unable to open /dev/tty"

But when I log in via ssh and try it with the system python:

~ wjanssen$ python
python
Python 2.5.1 (r251:54863, Jun 17 2009, 20:37:34) 
[GCC 4.0.1 (Apple Inc. build 5465)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> open("/dev/tty")
open("/dev/tty")
<open file '/dev/tty', mode 'r' at 0x597b8>
>>> 

Seems to work fine.  So this I don't understand.  Any ideas, anyone?

Bill

From stephen at xemacs.org  Wed Jun 30 04:55:02 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 30 Jun 2010 11:55:02 +0900
Subject: [Python-Dev] what environment variable should contain compiler
 warning suppression flags?
In-Reply-To: <i0d3ht$kco$1@dough.gmane.org>
References: <AANLkTil5y27nG0Jn7-xgAeTJL8o3qPjt91aTH5Lou3K_@mail.gmail.com>
	<AANLkTinsrUUG9qdeunkxG0g7Lqy3kkZPpoF_A_Osq7rZ@mail.gmail.com>
	<4C268F1E.5070506@egenix.com>
	<AANLkTinMzuSMTHzdcgTYIOmUiKELtAwQxTXCTh1M_U7J@mail.gmail.com>
	<AANLkTimeyppyUf0_SyblLFF3oo3_Zo0bDAmGE45JP0kJ@mail.gmail.com>
	<AANLkTikvbSzG1yqOSDjtX9qieZWvTcImya8ToQbTGz7l@mail.gmail.com>
	<AANLkTimq3Ec9tZZUze-JJnwGeuLVTNjuIAkz4zmAR7kf@mail.gmail.com>
	<4C2889B7.2060105@egenix.com>
	<AANLkTik0fadPz0lMjDOAUDeak2PMM6Mz3uTSwCwdAKBv@mail.gmail.com>
	<4C28ABD4.1030000@egenix.com>
	<AANLkTimYAPuI8Nq0S26thoT4_nZd3T4MVKh8tgM5vl4t@mail.gmail.com>
	<4C28BF83.9080903@egenix.com> <20100629105012.341adc7b@heresy>
	<i0d3ht$kco$1@dough.gmane.org>
Message-ID: <87y6dxb56h.fsf@uwakimon.sk.tsukuba.ac.jp>

Steve Holden writes:

 > I agree - trying to step through -O2 optimized code isn't going to
 > help debug your code, it's going to help you debug the
 > optimizer. That's a very rare use case.

Not really.  I don't have a lot of practice in debugging at that
level, so take it with a grain of salt, but what I've found with
XEmacs code is that debugging at -O0 is less often helpful than
debugging at -O2.  Quite often a naive compilation strategy is used
which basically turns those C statements into macros for the
underlying assembler, and the code works the way the author thinks it
should.  But his assumptions are invalid, and when optimized it fails.

So I guess you can call that "debugging the optimizer" if you like....

From guido at python.org  Wed Jun 30 05:57:09 2010
From: guido at python.org (Guido van Rossum)
Date: Tue, 29 Jun 2010 20:57:09 -0700
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <71728.1277866512@parc.com>
References: <71728.1277866512@parc.com>
Message-ID: <AANLkTimUuphnbKMCvrE7geY-3ABPYFMjz-11Cs_Vxst2@mail.gmail.com>

On Tue, Jun 29, 2010 at 7:55 PM, Bill Janssen <janssen at parc.com> wrote:
> My Leopard and Tiger PPC buildbots are momentarily green!  But I'm
> looking into why I'm skipping some tests.  My buildbots are up-to-date
> OS-wise and very vanilla, with the latest applicable Xcode.
>
> 4 skips unexpected on darwin:
>     test_gdb test_ioctl test_readline test_ttk_guionly
>
> Three of these (gdb, readline, ttk_guionly) are just bad predictions of
> which tests should skip on Darwin, I think -- gdb is only version 6, so
> that test won't run, readline doesn't get built, ttk doesn't work
> without Tcl/Tk 8.5.

So it looks like you could get readline and ttk to run and pass by
separately downloading and installing readline (I've done this many
times before) and Tcl/Tk (no idea but I suppose it should work).

> But the skip of test_ioctl baffles me.
>
> "test_ioctl skipped -- Unable to open /dev/tty"
>
> But when I log in via ssh and try it with the system python:
>
> ~ wjanssen$ python
> python
> Python 2.5.1 (r251:54863, Jun 17 2009, 20:37:34)
> [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
> >>> open("/dev/tty")
> open("/dev/tty")
> <open file '/dev/tty', mode 'r' at 0x597b8>
> >>>
>
> Seems to work fine.  So this I don't understand.  Any ideas, anyone?

Maybe the buildbot runs the tests as a tty-less daemon process. If you
ask me it's pretty crazy to have a test that requires a tty. But there
you have it -- and it's the same in Python 3. (But then again, who
knows, I might have written that test. ;-)

-- 
--Guido van Rossum (python.org/~guido)

From martin at v.loewis.de  Wed Jun 30 07:24:33 2010
From: martin at v.loewis.de (=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 30 Jun 2010 07:24:33 +0200
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <71728.1277866512@parc.com>
References: <71728.1277866512@parc.com>
Message-ID: <4C2AD511.5020709@v.loewis.de>

> Seems to work fine.  So this I don't understand.  Any ideas, anyone?

Didn't we discuss this before? The buildbot slave has no controlling
terminal anymore, hence it cannot open /dev/tty. If you are curious,
just patch your checkout to output the exact errno (e.g. to stdout),
and trigger a build through the web.
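
A minimal sketch of that kind of diagnostic, written for Python 3 (an
assumption; on the 2.x trunk the exception would be IOError). The helper
name describe_open_failure is hypothetical and is not part of test_ioctl:

```python
import errno
import os


def describe_open_failure(path="/dev/tty"):
    """Try to open path and report the exact errno if that fails."""
    try:
        f = open(path)
    except OSError as exc:
        # errno.errorcode maps the number to its symbolic name, e.g. ENXIO.
        name = errno.errorcode.get(exc.errno, "unknown")
        return "cannot open %s: errno=%s (%s): %s" % (
            path, exc.errno, name, os.strerror(exc.errno))
    else:
        f.close()
        return "opened %s" % path


print(describe_open_failure())
```

Running this once from an interactive ssh session and once inside the
buildbot run should show whether the two environments really differ.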

Regards,
Martin

From martin at v.loewis.de  Wed Jun 30 07:37:18 2010
From: martin at v.loewis.de (=?ISO-8859-15?Q?=22Martin_v=2E_L=F6wis=22?=)
Date: Wed, 30 Jun 2010 07:37:18 +0200
Subject: [Python-Dev] Taking over the Mercurial Migration
Message-ID: <4C2AD80E.9010404@v.loewis.de>

It seems that both Dirkjan and Brett are very caught up
with real life for the coming months. So I suggest that
some other committer who favors the Mercurial transition
steps forward and takes over this project.

If nobody volunteers, I propose that we release 3.2
from Subversion, and reconsider Mercurial migration
next year.

Regards,
Martin

From stephen at xemacs.org  Wed Jun 30 08:19:37 2010
From: stephen at xemacs.org (Stephen J. Turnbull)
Date: Wed, 30 Jun 2010 15:19:37 +0900
Subject: [Python-Dev]  Taking over the Mercurial Migration
In-Reply-To: <4C2AD80E.9010404@v.loewis.de>
References: <4C2AD80E.9010404@v.loewis.de>
Message-ID: <87sk45avpi.fsf@uwakimon.sk.tsukuba.ac.jp>

"Martin v. Löwis" writes:

 > It seems that both Dirkjan and Brett are very caught up
 > with real life for the coming months. So I suggest that
 > some other committer who favors the Mercurial transition
 > steps forward and takes over this project.

I am not a committer, and am not intimately familiar with PEP 385, so
not appropriate to become the proponent, I think.  However, I am one
of the PEP 374 co-authors, and have experience with a previous
transition to Mercurial of similar scale (XEmacs).  I can promise to
devote time to the transition in July and August, in support of
whoever might step forward.  I hope someone does.

 > If nobody volunteers, I propose that we release 3.2
 > from Subversion, and reconsider Mercurial migration
 > next year.

In the absence of a volunteer, I think that's probably necessary.

From g.brandl at gmx.net  Wed Jun 30 10:41:51 2010
From: g.brandl at gmx.net (Georg Brandl)
Date: Wed, 30 Jun 2010 10:41:51 +0200
Subject: [Python-Dev] Taking over the Mercurial Migration
In-Reply-To: <4C2AD80E.9010404@v.loewis.de>
References: <4C2AD80E.9010404@v.loewis.de>
Message-ID: <i0evum$985$1@dough.gmane.org>

On 30.06.2010 07:37, "Martin v. Löwis" wrote:
> It seems that both Dirkjan and Brett are very caught up
> with real life for the coming months. So I suggest that
> some other committer who favors the Mercurial transition
> steps forward and takes over this project.
> 
> If nobody volunteers, I propose that we release 3.2
> from Subversion, and reconsider Mercurial migration
> next year.

IIUC, Dirkjan is only caught up for another month.  I have
no problems with releasing a first 3.2 alpha from SVN and
then switching, so I propose that we target the migration
for August -- I can help in the second half of August if
needed.

Georg


From vinay_sajip at yahoo.co.uk  Wed Jun 30 11:23:37 2010
From: vinay_sajip at yahoo.co.uk (Vinay Sajip)
Date: Wed, 30 Jun 2010 09:23:37 +0000 (UTC)
Subject: [Python-Dev] Pickle security and remote logging
References: <AANLkTinvmLy1UaOODajeiD3ySn43cIQIPvnZHY4Yi_PS@mail.gmail.com>
	<loom.20100629T171329-642@post.gmane.org>
	<AANLkTimqepJXp-HsaqGf6XpsZqOINVlwgMAsPzkd-lFD@mail.gmail.com>
	<AANLkTikMY433SWKU9f9gMRj4Gl9i810G-DIzNN3zS9sb@mail.gmail.com>
Message-ID: <loom.20100630T111450-631@post.gmane.org>

Guido van Rossum <guido <at> python.org> writes:

> As for protocol buffers, assuming its absence (so far) from the
> stdlib is the only objection, how hard would it be to make the logging
> package "prepared" so that if one *did* have protocol buffers
> installed, it would be a one-line config setting to use them?

I envisage that if protocol buffers were available, and if support for them in
logging was to be added, this could be done via an optional keyword arg to the
SocketHandler which sets a handler attribute, which would then be used in
makePickle to make the required serialized form.

@anatoly: The documentation just mentions HMAC as an example; the levels of
paranoia to be applied are different for different people, different times and
different situations ;-) I assume that someone reading the docs could readily
see that they could substitute "sign the pickle" with some alternative strategy
in makePickle. You could implement marshal, protocol buffers etc. right now just
by overriding SocketHandler.makePickle in your custom class.

An alternative strategy would be to provide an optional serializer=None callable
in the SocketHandler constructor. If specified, then makePickle would call this
serializer with the LogRecord instance as the only argument, and use the return
value as the serialized form, instead of calling pickle.dumps.
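
As a rough sketch of that second strategy (Python 3 syntax assumed; the
class name SerializerSocketHandler and the json_serializer helper are
hypothetical and not part of the logging package), a subclass could accept
the callable and fall back to the stock pickling when none is given.
Keeping the 4-byte length prefix that the stock makePickle emits is also an
assumption about what a receiver would expect:

```python
import json
import logging
import logging.handlers
import struct


class SerializerSocketHandler(logging.handlers.SocketHandler):
    """SocketHandler with an optional, caller-supplied serializer."""

    def __init__(self, host, port, serializer=None):
        super().__init__(host, port)
        self.serializer = serializer

    def makePickle(self, record):
        if self.serializer is None:
            # No serializer given: keep the default pickle-based format.
            return super().makePickle(record)
        payload = self.serializer(record)
        # Length-prefix the payload the same way the default format does.
        return struct.pack(">L", len(payload)) + payload


def json_serializer(record):
    """Hypothetical serializer: a few LogRecord fields as UTF-8 JSON."""
    data = {
        "name": record.name,
        "levelname": record.levelname,
        "msg": record.getMessage(),
    }
    return json.dumps(data).encode("utf-8")
```

The handler would then be created as, say,
SerializerSocketHandler("localhost", 9020, serializer=json_serializer),
with the receiving end reading the length prefix and decoding the JSON.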

Regards,

Vinay Sajip


From exarkun at twistedmatrix.com  Wed Jun 30 13:32:32 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 30 Jun 2010 11:32:32 -0000
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <4C2AD511.5020709@v.loewis.de>
References: <71728.1277866512@parc.com>
	<4C2AD511.5020709@v.loewis.de>
Message-ID: <20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>

On 05:24 am, martin at v.loewis.de wrote:
>>Seems to work fine.  So this I don't understand.  Any ideas, anyone?
>
>Didn't we discuss this before? The buildbot slave has no controlling
>terminal anymore, hence it cannot open /dev/tty. If you are curious,
>just patch your checkout to output the exact errno (e.g. to stdout),
>and trigger a build through the web.

Could the test be rewritten (or supplemented) to use a pty?  Most or 
perhaps all of the same operations should be supported.

Jean-Paul

From steve at holdenweb.com  Wed Jun 30 14:42:05 2010
From: steve at holdenweb.com (Steve Holden)
Date: Wed, 30 Jun 2010 08:42:05 -0400
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <20100630000529.3AA351FF08C@kimball.webabinitio.net>
References: <i0cu24$u02$1@dough.gmane.org>	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>	<4C2A0294.3070806@holdenweb.com>
	<i0d154$b1t$1@dough.gmane.org>	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>	<i0d46b$n5g$1@dough.gmane.org>	<20100629165228.GA5350@amk-desktop.matrixgroup.net>	<4C2A3341.4010705@holdenweb.com>	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>	<4C2A5F56.2010700@holdenweb.com>
	<20100630000529.3AA351FF08C@kimball.webabinitio.net>
Message-ID: <4C2B3B9D.3080200@holdenweb.com>

R. David Murray wrote:
> On Tue, 29 Jun 2010 17:02:14 -0400, Steve Holden <steve at holdenweb.com> wrote:
>> Guido van Rossum wrote:
>>
>>> - wrap the binary stream in a text stream
>> "wrap" how? The ultimate destiny of the text is twofold:
> 
> I would imagine Guido is talking about an io.TextIOWrapper...in other
> words, take the binary file you've just finished grabbing info
> from, and reread it as a text file in order to grab the actual
> message content.
> 
> If you have messages in your files that are using an 8bit content
> transfer encoding, then you (currently) will have some problems
> unless the charset happens to be the one you use when you wrap
> the binary stream as a text stream.
> 
http://bugs.python.org/issue9124

regards
 Steve
-- 
Steve Holden           +1 571 484 6266   +1 800 494 3119
DjangoCon US September 7-9, 2010    http://djangocon.us/
See Python Video!       http://python.mirocommunity.org/
Holden Web LLC                 http://www.holdenweb.com/

From janssen at parc.com  Wed Jun 30 18:00:09 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 30 Jun 2010 09:00:09 PDT
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <AANLkTimUuphnbKMCvrE7geY-3ABPYFMjz-11Cs_Vxst2@mail.gmail.com>
References: <71728.1277866512@parc.com>
	<AANLkTimUuphnbKMCvrE7geY-3ABPYFMjz-11Cs_Vxst2@mail.gmail.com>
Message-ID: <68796.1277913609@parc.com>

Guido van Rossum <guido at python.org> wrote:

> On Tue, Jun 29, 2010 at 7:55 PM, Bill Janssen <janssen at parc.com> wrote:
> > My Leopard and Tiger PPC buildbots are momentarily green!  But I'm
> > looking into why I'm skipping some tests.  My buildbots are up-to-date
> > OS-wise and very vanilla, with the latest applicable Xcode.
> >
> > 4 skips unexpected on darwin:
> >     test_gdb test_ioctl test_readline test_ttk_guionly
> >
> > Three of these (gdb, readline, ttk_guionly) are just bad predictions of
> > which tests should skip on Darwin, I think -- gdb is only version 6, so
> > that test won't run, readline doesn't get built, ttk doesn't work
> > without Tcl/Tk 8.5.
> 
> So it looks like you could get readline and ttk to run and pass by
> separately downloading and installing readline (I've done this many
> times before) and Tcl/Tk (no idea but I suppose it should work).

Sure.  But the skips should be expected "on Darwin", since a vanilla OS
X system apparently won't have the necessary bits.  At the very least,
regrtest.py should test for these conditions and add them to the
"expected skips" list if necessary.  I'll work up a patch.

> > But the skip of test_ioctl baffles me.
> >
> > "test_ioctl skipped -- Unable to open /dev/tty"
> >
> > But when I log in via ssh and try it with the system python:
> >
> > ~ wjanssen$ python
> > python
> > Python 2.5.1 (r251:54863, Jun 17 2009, 20:37:34)
> > [GCC 4.0.1 (Apple Inc. build 5465)] on darwin
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> open("/dev/tty")
> > open("/dev/tty")
> > <open file '/dev/tty', mode 'r' at 0x597b8>
> > >>>
> >
> > Seems to work fine.  So this I don't understand.  Any ideas, anyone?
> 
> Maybe the buildbot runs the tests as a tty-less daemon process. If you
> ask me it's pretty crazy to have a test that requires a tty. But there
> you have it -- and it's the same in Python 3. (But then again, who
> knows, I might have written that test. ;-)

So, my question then is, why are these skips "unexpected"?  Seems to me
that if this is the case, this test will never run on any platform.

Bill

From janssen at parc.com  Wed Jun 30 18:03:15 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 30 Jun 2010 09:03:15 PDT
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <4C2AD511.5020709@v.loewis.de>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
Message-ID: <68821.1277913795@parc.com>

Martin v. Löwis <martin at v.loewis.de> wrote:

> > Seems to work fine.  So this I don't understand.  Any ideas, anyone?
> 
> Didn't we discuss this before?

Possibly, but I don't recall doing so.

> The buildbot slave has no controlling
> terminal anymore, hence it cannot open /dev/tty. If you are curious,
> just patch your checkout to output the exact errno (e.g. to stdout),
> and trigger a build through the web.

So, why is skipping this test "unexpected"?  I see "x86 Tiger" is also
showing this as an unexpected skip.  Should I just add it to the list of
expected skips on Darwin?  Actually, will it run on any platform?

Bill

From janssen at parc.com  Wed Jun 30 18:26:24 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 30 Jun 2010 09:26:24 PDT
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
	<20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>
Message-ID: <69334.1277915184@parc.com>

exarkun at twistedmatrix.com wrote:

> Could the test be rewritten (or supplemented) to use a pty?  Most or
> perhaps all of the same operations should be supported.

Buildbot seems to be explicitly not using a PTY.  From the top of
the test output:

make buildbottest
 in dir /Users/buildbot/buildarea/trunk.parc-leopard-1/build (timeout 1800 secs)
 watching logfiles {}
 argv: ['make', 'buildbottest']
 [...]
 closing stdin
 using PTY: False

I believe this is specified by the build master.

This test seems to work on Ubuntu and FreeBSD, though.

Bill

From solipsis at pitrou.net  Wed Jun 30 18:42:58 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 30 Jun 2010 18:42:58 +0200
Subject: [Python-Dev] Mailbox module - timings and functionality changes
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
	<4C2A3341.4010705@holdenweb.com>
	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>
	<4C2A5F56.2010700@holdenweb.com>
	<20100630000529.3AA351FF08C@kimball.webabinitio.net>
Message-ID: <20100630184258.473d8535@pitrou.net>

On Tue, 29 Jun 2010 20:05:29 -0400
"R. David Murray" <rdmurray at bitdance.com> wrote:
> 
> I would imagine Guido is talking about an io.TextIOWrapper...in other
> words, take the binary file you've just finished grabbing info
> from, and reread it as a text file in order to grab the actual
> message content.

This sounds a bit suboptimal to me (and introduces race conditions if
e.g. the file is replaced with another one before you reopen it). You
could instead decode the binary data by yourself, especially if you
have already stored that data somewhere.
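
A small sketch of decoding by hand, assuming the (start, stop) byte
offsets come from a table of contents built while scanning the binary
file; the name read_text_slice and the charset default are hypothetical:

```python
import io


def read_text_slice(binary_file, start, stop, charset="utf-8"):
    """Read bytes [start, stop) from an open binary file and decode them."""
    binary_file.seek(start)
    raw = binary_file.read(stop - start)
    # errors="replace" keeps going on bytes that do not fit the charset.
    return raw.decode(charset, errors="replace")


# Stand-in for an mbox file that is already open in binary mode.
mbox = io.BytesIO(b"From a\n\nfirst body\nFrom b\n\nsecond body\n")
body = read_text_slice(mbox, 8, 19)
```

The same open file object is used throughout, so nothing is reopened.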

Also, please note that values used by seek() and tell() on
text I/O are "opaque cookies". While they can happen to match the
raw binary file position, it is a mere coincidence (or an
implementation detail, at your will). Therefore, reusing tell() values
of a binary file to seek() a TextIOWrapper accessing the same file
is wrong.


From solipsis at pitrou.net  Wed Jun 30 18:44:57 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 30 Jun 2010 18:44:57 +0200
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
References: <71728.1277866512@parc.com>
	<AANLkTimUuphnbKMCvrE7geY-3ABPYFMjz-11Cs_Vxst2@mail.gmail.com>
	<68796.1277913609@parc.com>
Message-ID: <20100630184457.10067764@pitrou.net>

On Wed, 30 Jun 2010 09:00:09 PDT
Bill Janssen <janssen at parc.com> wrote:
> 
> So, my question then is, why are these skips "unexpected"?  Seems to me
> that if this is the case, this test will never run on any platform.

You can change the value of the "usepty" option in your buildbot.tac.
(you will also have to restart the buildslave process)

Regards

Antoine.


From guido at python.org  Wed Jun 30 19:03:49 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 30 Jun 2010 10:03:49 -0700
Subject: [Python-Dev] Mailbox module - timings and functionality changes
In-Reply-To: <20100630184258.473d8535@pitrou.net>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com> 
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org> 
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com> 
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net> 
	<4C2A3341.4010705@holdenweb.com>
	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com> 
	<4C2A5F56.2010700@holdenweb.com>
	<20100630000529.3AA351FF08C@kimball.webabinitio.net> 
	<20100630184258.473d8535@pitrou.net>
Message-ID: <AANLkTilTd7FE0Fsz5LCPGwkM-0nstqMPsx4jHdsuDa6h@mail.gmail.com>

On Wed, Jun 30, 2010 at 9:42 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Tue, 29 Jun 2010 20:05:29 -0400
> "R. David Murray" <rdmurray at bitdance.com> wrote:
>>
>> I would imagine Guido is talking about an io.TextIOWrapper...in other
>> words, take the binary file you've just finished grabbing info
>> from, and reread it as a text file in order to grab the actual
>> message content.
>
> This sounds a bit suboptimal to me (and introduces race conditions if
> e.g. the file is replaced with another one before you reopen it). You
> could instead decode the binary data by yourself, especially if you
> have already stored that data somewhere.

That's why I proposed not reopening but wrapping.

Of course the contents of the file could still change, but that's a
limitation of how the mailbox module works -- it builds a TOC and
expects the file not to change.

> Also, please note that values used by seek() and tell() on
> text I/O are "opaque cookies". While they can happen to match the
> raw binary file position, it is a mere coincidence (or an
> implementation detail, at your will). Therefore, reusing tell() values
> of a binary file to seek() a TextIOWrapper accessing the same file
> is wrong.

Well, um, I actually designed it carefully so that bytes offsets
*would* work as text offsets in those cases where they make sense at
all.

-- 
--Guido van Rossum (python.org/~guido)

From solipsis at pitrou.net  Wed Jun 30 19:20:34 2010
From: solipsis at pitrou.net (Antoine Pitrou)
Date: Wed, 30 Jun 2010 19:20:34 +0200
Subject: [Python-Dev] TextIOWrapper.tell()
In-Reply-To: <AANLkTilTd7FE0Fsz5LCPGwkM-0nstqMPsx4jHdsuDa6h@mail.gmail.com>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com>
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org>
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com>
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net>
	<4C2A3341.4010705@holdenweb.com>
	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com>
	<4C2A5F56.2010700@holdenweb.com>
	<20100630000529.3AA351FF08C@kimball.webabinitio.net>
	<20100630184258.473d8535@pitrou.net>
	<AANLkTilTd7FE0Fsz5LCPGwkM-0nstqMPsx4jHdsuDa6h@mail.gmail.com>
Message-ID: <20100630192034.5740825b@pitrou.net>

On Wed, 30 Jun 2010 10:03:49 -0700
Guido van Rossum <guido at python.org> wrote:
> 
> > Also, please note that values used by seek() and tell() on
> > text I/O are "opaque cookies". While they can happen to match the
> > raw binary file position, it is a mere coincidence (or an
> > implementation detail, at your will). Therefore, reusing tell() values
> > of a binary file to seek() a TextIOWrapper accessing the same file
> > is wrong.
> 
> Well, um, I actually designed it carefully so that bytes offsets
> *would* work as text offsets in those cases where they make sense at
> all.

Ah, this is embarrassing. I always assumed it was an implementation
detail since neither the PEP nor the module docs say otherwise.

PEP 3116 clearly says:

"Unlike with raw I/O, the units for .seek() are not specified - some
implementations (e.g. StringIO) use characters and others (e.g.
TextIOWrapper) use bytes."

And also:

".seek(pos: object, whence: int = 0) -> int

    Seek to position pos. If pos is non-zero, it must be a cookie
    returned from .tell() and whence must be zero."

"it must be a cookie returned from .tell()" here seems to imply that
non-zero values of other origin should not be used.

Regards

Antoine.

From guido at python.org  Wed Jun 30 19:28:10 2010
From: guido at python.org (Guido van Rossum)
Date: Wed, 30 Jun 2010 10:28:10 -0700
Subject: [Python-Dev] TextIOWrapper.tell()
In-Reply-To: <20100630192034.5740825b@pitrou.net>
References: <i0cu24$u02$1@dough.gmane.org>
	<AANLkTimzN1XiYE4LLPWJ6-b_qhxpylQHTwpfjupixNfH@mail.gmail.com> 
	<4C2A0294.3070806@holdenweb.com> <i0d154$b1t$1@dough.gmane.org> 
	<AANLkTimLTw_aLFmTV2vUtpaYFbfH2-1vAgudW1rKHYg5@mail.gmail.com> 
	<i0d46b$n5g$1@dough.gmane.org>
	<20100629165228.GA5350@amk-desktop.matrixgroup.net> 
	<4C2A3341.4010705@holdenweb.com>
	<AANLkTinKb-UxaAMklR5sI2aJ5nSJJQaS4H9nWTil6ZvM@mail.gmail.com> 
	<4C2A5F56.2010700@holdenweb.com>
	<20100630000529.3AA351FF08C@kimball.webabinitio.net> 
	<20100630184258.473d8535@pitrou.net>
	<AANLkTilTd7FE0Fsz5LCPGwkM-0nstqMPsx4jHdsuDa6h@mail.gmail.com> 
	<20100630192034.5740825b@pitrou.net>
Message-ID: <AANLkTim2D1QGry__rAjo8CRM-DWVGkv4WOZY6tyRFtRh@mail.gmail.com>

On Wed, Jun 30, 2010 at 10:20 AM, Antoine Pitrou <solipsis at pitrou.net> wrote:
> On Wed, 30 Jun 2010 10:03:49 -0700
> Guido van Rossum <guido at python.org> wrote:
>>
>> > Also, please note that values used by seek() and tell() on
>> > text I/O are "opaque cookies". While they can happen to match the
>> > raw binary file position, it is a mere coincidence (or an
>> > implementation detail, at your will). Therefore, reusing tell() values
>> > of a binary file to seek() a TextIOWrapper accessing the same file
>> > is wrong.
>>
>> Well, um, I actually designed it carefully so that bytes offsets
>> *would* work as text offsets in those cases where they make sense at
>> all.
>
> Ah, this is embarrassing. I always assumed it was an implementation
> detail since neither the PEP nor the module docs say otherwise.
>
> PEP 3116 clearly says:
>
> "Unlike with raw I/O, the units for .seek() are not specified - some
> implementations (e.g. StringIO) use characters and others (e.g.
> TextIOWrapper) use bytes."
>
> And also:
>
> ".seek(pos: object, whence: int = 0) -> int
>
>     Seek to position pos. If pos is non-zero, it must be a cookie
>     returned from .tell() and whence must be zero."
>
> "it must be a cookie returned from .tell()" here seems to imply that
> non-zero values of other origin should not be used.

Guilty as charged. I really did take care that it would work, but
forgot to mention it. I guess we can depend on this property *inside*
the stdlib (as long as there are tests for each piece of code
depending on it that would break if it ever changed) but should not
advertise it widely. Note that it doesn't go the other way -- due to
encoding state, text streams can certainly return cookies that make no
sense to binary streams. But text streams take byte offsets too and do
the best they can. (Obviously if a byte offset points in the middle of
a multibyte character all bets are off.)

The C stdlib has a similar thing -- while AFAIK POSIX lseek() really
is required to return and take byte offsets, this is not required for
fseek() and ftell() according to the C std -- but I think it's still a
pretty safe bet, and I betcha lots of apps are making this assumption.
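
To make the byte-offset property concrete, here is a small sketch; the
function name is made up, it relies on current CPython behaviour rather
than anything documented, and the offset falls on a character boundary:

```python
import io


def second_line_via_byte_offset():
    """Seek a TextIOWrapper using an offset taken from the binary stream."""
    data = "héllo\nwörld\n".encode("utf-8")
    raw = io.BytesIO(data)
    raw.readline()          # consume the first line as bytes
    offset = raw.tell()     # byte offset where the second line starts
    text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
    text.seek(offset)       # reuse the byte offset as a text position
    return offset, text.readline()


offset, line = second_line_via_byte_offset()
```

The reverse direction is not attempted here: a cookie returned by
text.tell() may carry decoder state and need not be a plain byte offset.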

-- 
--Guido van Rossum (python.org/~guido)

From martin at v.loewis.de  Wed Jun 30 19:29:36 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 30 Jun 2010 19:29:36 +0200
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>
References: <71728.1277866512@parc.com>	<4C2AD511.5020709@v.loewis.de>
	<20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>
Message-ID: <4C2B7F00.2010602@v.loewis.de>

On 30.06.2010 13:32, exarkun at twistedmatrix.com wrote:
> On 05:24 am, martin at v.loewis.de wrote:
>>> Seems to work fine.  So this I don't understand.  Any ideas, anyone?
>>
>> Didn't we discuss this before? The buildbot slave has no controlling
>> terminal anymore, hence it cannot open /dev/tty. If you are curious,
>> just patch your checkout to output the exact errno (e.g. to stdout),
>> and trigger a build through the web.
> 
> Could the test be rewritten (or supplemented) to use a pty?  Most or
> perhaps all of the same operations should be supported.

I'm not sure. It uses TIOCGPGRP, basically to establish that ioctl
can also put results into a Python array (IIUC). This goes back to
http://bugs.python.org/555817

Somebody rewriting it would need to make sure the original test purpose
is still met.
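
A rough sketch of the pty idea, with one loud substitution: it uses
TIOCGWINSZ rather than TIOCGPGRP, since whether TIOCGPGRP does anything
useful on a pty that is not the controlling terminal is platform-dependent
and not something this sketch settles. It only shows ioctl filling in a
mutable Python array; the function name is hypothetical:

```python
import array
import fcntl
import os
import pty
import termios


def winsize_from_pty():
    """Ask a freshly opened pty for its window size via ioctl."""
    master_fd, slave_fd = pty.openpty()
    try:
        # struct winsize is four unsigned shorts: rows, cols, xpixel, ypixel.
        buf = array.array("H", [0, 0, 0, 0])
        # With a mutable buffer and mutate_flag true, ioctl writes the
        # result into buf and returns the system call's return code.
        rc = fcntl.ioctl(slave_fd, termios.TIOCGWINSZ, buf, True)
        return rc, list(buf)
    finally:
        os.close(master_fd)
        os.close(slave_fd)
```

No controlling terminal is needed for this, so it should not depend on how
the buildslave was started, but it does need a platform with pty support.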

Regards,
Martin

From barry at python.org  Wed Jun 30 20:16:14 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 14:16:14 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23D3C2.1060500@scottdial.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
Message-ID: <20100630141614.10dbccde@heresy>

I'm trying to catch up on this thread, so I may collapse some responses or
refer to points others have brought up.

On Jun 24, 2010, at 05:53 PM, Scott Dial wrote:

>If the package has .so files that aren't compatible with other version
>of python, then what is the motivation for placing that in a shared
>location (since it can't actually be shared)?

I think Matthias has described the motivation for the Debian/Ubuntu case, and
James describes Python's current search algorithm for a package's .py[c] and
.so files.  There are a few points that you've made that I want to respond to.

You claim that the versioned .so files scheme is "more complicated" than multiple
version-specific search paths (if I understand your counter proposal
correctly).  It all depends on your point of view.  From mine, a 100 line
patch that almost nobody but (some) distros will care about or be affected by,
and that only changes a fairly obscure build-time configuration, is much
simpler than trying to make version-specific search paths work.

If you build Python from source, you do not care about this patch and you'll
never see its effects.  If you get Python on a distribution that only gives
you one version of Python at a time, you also will probably never care or see
the effects of this patch.  If you're a Debian or Ubuntu user who wants to use
Python 3.2 and 3.3, you *might* care about it, but most likely it'll just work
behind the scenes.  If you're a Python packager or work on the Python
infrastructure for one of those platforms, then you will care.

About just sharing the py files.  You say that would be acceptable to you, but
it's actually a pretty big deal.  If you're supporting two versions of Python,
then every distro Python package doubles in size.  Even with compression,
you're talking longer download times and probably more critically, you've
greatly increased CDROM space pressures.  The Ubuntu CDROM is already
essentially at capacity so doubling the size of all Python packages (most of
which btw do not have extension modules) makes such an approach impossible.
Moving to a DVD image has been discussed, but it is currently believed not in
the best interest of users, especially on slow links, to do so at this time.

The versioned .so approach will of course increase the size of packages by
twice the contained .so file size, and that's already an uncomfortable but
acceptable increase.  It's acceptable because of the gain users get by having
multiple versions of Python available and the fact that there aren't nearly as
many extension modules as there are Python files.  Doubling the size of .py
files as well isn't acceptable.

>But the only motivation for doing this with .pyc files is that the .py
>files are able to be shared, since the .pyc is an on-demand-generated,
>version-specific artifact (and not the source). The .so file is created
>offline by another toolchain, is version-specific, and presumably you
>are not suggesting that Python generate it on-demand.

Definitely not.  pyc files are generated upon installation of the distro
package, but of course the .so files must be compiled on a build machine and
included in the distro package.  The whole process is much simpler if the
versioned .so files can just live in the same directory.

>For packages that have .so files, won't the distro already have to build
>multiple copies of that package for all versions of Python? So, why can't
>it place them in separate directories that are version-specific at that
>time? This is not the same as placing .py files that are
>version-agnostic into a version-agnostic location.

It's not a matter of "could", it's a matter of simplicity, and I think
versioned .so files are the simplest solution given all the constraints.

-Barry
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 836 bytes
Desc: not available
URL: <http://mail.python.org/pipermail/python-dev/attachments/20100630/275b8632/attachment-0001.pgp>

From barry at python.org  Wed Jun 30 20:31:05 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 14:31:05 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C266185.7080509@ubuntu.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTinBlqul1q_WKmvL7WDFv8eY3NE3lFEY_tHUH3kI@mail.gmail.com>
	<20100624135119.00b9ac5c@heresy>
	<AANLkTindH5uADbSwan-xWV08YcDaEKI3CleaFjhdmHvX@mail.gmail.com>
	<20100624142830.4c859faf@limelight.wooz.org>
	<20100624164637.22fd9160@heresy> <4C266185.7080509@ubuntu.com>
Message-ID: <20100630143105.37e1225e@heresy>

On Jun 26, 2010, at 10:22 PM, Matthias Klose wrote:

>On 24.06.2010 22:46, Barry Warsaw wrote:
>> So, we could say that PEP 384 compliant extension modules would get written
>> without a version specifier.  IOW, we'd treat foo.so as using the ABI.  It
>> would then be up to the Python runtime to throw ImportErrors if in fact we
>> were loading a legacy, non-PEP 384 compliant extension.
>
>Is it realistic to never break the ABI?  I would think of having the ABI
>encoded in the file name as well, and only bump the ABI if it does change.
>With the "versioned .so files" proposal an ABI bump is necessary with every
>python version, with PEP 384 the ABI bump will be decoupled from the python
>version.

You're right that the ABI will break, requiring a bump, and I think you're
right that this means that PEP 384 compliant shared libraries would have to
have a version number in their file name (assuming the versioned .so proposal
is accepted).

The problem is that we would need two version numbers, one for extension
modules that are not PEP 384 compliant (and thus get bumped for every new
Python version), and one for modules that are PEP 384 compliant (and thus only
get bumped once in a while).  The reason is that I think it will always be the
case that we will have PEP 384 compliant and non-compliant extension modules.

Perhaps identifying the underlying problems will lead to a more acceptable
patch for Python.  My patch tries to take a simple (perhaps too simplistic)
solution, and I'm not married to it, but I think the general idea of versioned
.so files is the right one.

1. The file name extensions that Python searches for are hardcoded and
   compiled in.

dynload_shlib.c hard codes the file name pattern that extension modules must
have in order for Python to load them.  They must be <foo>.so or
<foo>module.so.  This gets compiled into Python at build time and there's no
way for a distro (or anyone else who builds Python from source) to extend the
file name patterns without modifying the source code.

2. The extension that distutils writes for shared libraries is dictated by
   build-time options and cannot be overridden.

When you ./configure Python, autoconf figures out what shared library
extension your platform uses.  It substitutes this into a Makefile variable.
That Makefile gets installed into your system with the base Python package and
distutils parses the Makefile looking for this variable.  When distutils calls
your platform compiler, it uses this Makefile variable as the file name
extension to use for your shared library.  You cannot change this or override
it to get distutils to write some other file name extension.
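
Problem 1 can be pictured with a minimal sketch of a suffix-driven search.
This is not the code in the interpreter: the function names and the
".3.2.so" suffix are hypothetical, and the suffix lists are plain Python
data here rather than something compiled in.

```python
import os

# Hypothetical suffix lists for illustration; the real list is compiled
# into the interpreter and may differ by platform and version.
BUILTIN_SUFFIXES = [".so", "module.so"]
VERSIONED_SUFFIXES = [".3.2.so"] + BUILTIN_SUFFIXES  # ".3.2.so" is made up


def candidate_filenames(modname, suffixes):
    """Return the file names a loader would try for modname, in order."""
    return [modname + suffix for suffix in suffixes]


def find_extension(modname, directory, suffixes):
    """Return the first existing candidate path in directory, or None."""
    for filename in candidate_filenames(modname, suffixes):
        path = os.path.join(directory, filename)
        if os.path.isfile(path):
            return path
    return None
```

With only the built-in suffixes, a file named foo.3.2.so is never found;
adding one more suffix to the list is enough for the same search to pick
it up, which is the whole effect of extending the search patterns.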

Of these two problems, #1 is more serious because we have to modify the Python
source code to hack in additional shared library search suffixes.  #2 can be
worked around by renaming the .so file after the build.  The disadvantage of
this though is that if you're a local packager, you'll have to remember to do
the same thing if you want multiple Python version support, because distutils
won't take care of it for you.
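
The rename workaround for #2 might look roughly like the sketch below.  It
assumes the build leaves plain <foo>.so files in one directory; the helper
names and the "3.2" tag format are hypothetical, not something distutils
provides.

```python
import os


def versioned_name(filename, tag):
    """Map 'foo.so' to 'foo.<tag>.so'; the tag format is an assumption."""
    base, ext = os.path.splitext(filename)
    if ext != ".so":
        raise ValueError("not a shared library: %r" % filename)
    return "%s.%s%s" % (base, tag, ext)


def rename_built_extensions(build_dir, tag):
    """Rename every untagged .so file in build_dir; return the new names."""
    renamed = []
    for name in sorted(os.listdir(build_dir)):
        # Skip non-libraries and files that already carry this tag.
        if not name.endswith(".so") or name.endswith("." + tag + ".so"):
            continue
        new_name = versioned_name(name, tag)
        os.rename(os.path.join(build_dir, name),
                  os.path.join(build_dir, new_name))
        renamed.append(new_name)
    return renamed
```

A packager would have to run something like this after every build, for
every Python version, which is the extra step described above.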

Maybe that's okay, in which case it would still be good to address #1.

-Barry

From barry at python.org  Wed Jun 30 20:39:50 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 14:39:50 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C246E81.3020302@scottdial.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
Message-ID: <20100630143950.60da41f7@heresy>

On Jun 25, 2010, at 04:53 AM, Scott Dial wrote:

>My suggestion was that a package that contains .so files should not be
>shared (e.g., the entire lxml package should be placed in a
>version-specific path).

Matthias outlined some of the pitfalls with this approach.

>The motivation for this PEP was to simplify the installation of python packages
>for distros; it was not to reduce the number of .py files on the disk.

As others have pointed out, versioned .so files are not part of PEP 3147.  That
PEP does reduce the number of .py files on disk, which, as I explained in a
previous follow-up, is an important consideration.

>Placing .so files together does not simplify that install process in any
>way.

I disagree of course. :)

>You will still have to handle such packages in a special way. You must still
>compile the package multiple times for each relevant version of python (with
>special tagging that I imagine distutils can take care of) and, worse yet,

No, distutils cannot take care of this.  There is no way currently to tell
distutils to generate a .so file with anything but the platform-specific way
of spelling "shared library".

>you have created a trickier install than merely having multiple search
>paths (e.g., installing/uninstalling lxml for *one* version of python is
>actually more difficult in this scheme).

That's not a use case we care about.  If you have Python 3.2 and 3.3 installed
on your system, why would you want lxml installed for one but not the other?
And even if for some reason you did, the only way to do that would be in a way
similar to handling the PEP 3147 pyc files.

-Barry

From exarkun at twistedmatrix.com  Wed Jun 30 20:46:02 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 30 Jun 2010 18:46:02 -0000
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <20100630184457.10067764@pitrou.net>
References: <71728.1277866512@parc.com>
	<AANLkTimUuphnbKMCvrE7geY-3ABPYFMjz-11Cs_Vxst2@mail.gmail.com>
	<68796.1277913609@parc.com> <20100630184457.10067764@pitrou.net>
Message-ID: <20100630184602.1937.1550858232.divmod.xquotient.569@localhost.localdomain>


On 04:44 pm, solipsis at pitrou.net wrote:
>On Wed, 30 Jun 2010 09:00:09 PDT
>Bill Janssen <janssen at parc.com> wrote:
>>
>>So, my question then is, why are these skips "unexpected"?  Seems to 
>>me
>>that if this is the case, this test will never run on any platform.
>
>You can change the value of the "usepty" option in your buildbot.tac.
>(you will also have to restart the buildslave process)

But don't do this.  The usepty option is completely unrelated to the 
suggestion I was making.  Flipping it to True will only cause other 
things to break and have no impact on this test.

Jean-Paul

From exarkun at twistedmatrix.com  Wed Jun 30 20:49:54 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 30 Jun 2010 18:49:54 -0000
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <69334.1277915184@parc.com>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
	<20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>
	<69334.1277915184@parc.com>
Message-ID: <20100630184954.1937.1956849777.divmod.xquotient.577@localhost.localdomain>

On 04:26 pm, janssen at parc.com wrote:
>exarkun at twistedmatrix.com wrote:
>>Could the test be rewritten (or supplemented) to use a pty?  Most or
>>perhaps all of the same operations should be supported.
>
>Buildbot seems to be explicitly not using a PTY.  From the top of
>the test output:
>
>make buildbottest
>in dir /Users/buildbot/buildarea/trunk.parc-leopard-1/build (timeout 
>1800 secs)
>watching logfiles {}
>argv: ['make', 'buildbottest']
>[...]
>closing stdin
>using PTY: False

This output is telling you that the build slave isn't giving the child 
processes it creates a pty.  What I had in mind was writing the test to 
create a new pty, instead of trying to use the controlling tty.  So 
basically, the two things are completely unrelated and this buildbot 
configuration isn't hurting anything (and in fact is likely helping 
quite a few things, so I suggest leaving it alone).
>
>I believe this is specified by the build master.
>
>This test seems to work on Ubuntu and FreeBSD, though.

That's interesting.  I wonder if those slaves are able to open /dev/tty 
for some reason?  The slave is supposed to detach from the controlling 
terminal when it daemonizes.  There could be a bug in that code, I 
suppose, or the slaves could be running without daemonization for some 
reason.  The operators would have to tell us about that, I think.  Or, 
another possibility is that /dev/tty doesn't work how I expect it to and 
on Ubuntu and FreeBSD it can be opened even if you don't have a 
controlling terminal.  Hopefully not, though.
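
One way to check this on a given slave is a small probe that reports the
errno from opening /dev/tty.  This is only a sketch; the function name is
hypothetical and it is not part of the test suite.

```python
import errno
import os


def probe_tty(path="/dev/tty"):
    """Try to open path; return (True, None) or (False, errno name)."""
    try:
        fd = os.open(path, os.O_RDWR)
    except OSError as exc:
        # Translate the numeric errno into its symbolic name if known.
        return (False, errno.errorcode.get(exc.errno, str(exc.errno)))
    os.close(fd)
    return (True, None)
```

Running it from within a build step (rather than from an interactive
shell) would show whether the slave process really has lost its
controlling terminal, and with which error.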

Jean-Paul

From barry at python.org  Wed Jun 30 20:53:29 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 14:53:29 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C268433.30405@scottdial.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
	<A9C41BCC-1D14-44AD-A868-61579B718278@fuhm.net>
	<4C265DC6.4080600@ubuntu.com> <4C268433.30405@scottdial.com>
Message-ID: <20100630145329.736f2aab@heresy>

On Jun 26, 2010, at 06:50 PM, Scott Dial wrote:

>On 6/26/2010 4:06 PM, Matthias Klose wrote:
>> On 25.06.2010 22:12, James Y Knight wrote:
>>> On Jun 25, 2010, at 4:53 AM, Scott Dial wrote:
>>>> Placing .so files together does not simplify that install process in any
>>>> way. You will still have to handle such packages in a special way.
>>>
>>> This is a good point, but I think still falls short of a solution. For a
>>> package like lxml, indeed you are correct. Since debian needs to build
>>> it once per version, it could just put the entire package (.py files and
>>> .so files) into a different per-python-version directory.
>> 
>> This is what is currently done.  This will increase the size of packages
>> by duplicating the .py files, or you have to install the .py in a common
>> location (irrelevant to sys.path), and provide (sym)links to the
>> expected location.
>
>"This is what is currently done"  and "provide (sym)links to the
>expected location" are conflicting statements.

I think Matthias's "what is currently done" referred to your statement
"debian needs to build it once per version".  Providing symlinks is how we are
able to make it appear that there are version-specific py files without
actually doing so.
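
As a rough sketch of that symlink arrangement (the function name and the
directory layout are hypothetical, not the actual Debian tooling): one
shared copy of each .py file, plus a link to it in every version-specific
directory.

```python
import os


def link_shared_sources(shared_dir, version_dirs):
    """Symlink each .py file in shared_dir into every version directory.

    shared_dir should be an absolute path so the links resolve from
    anywhere.  Returns the list of links that were created.
    """
    created = []
    for name in sorted(os.listdir(shared_dir)):
        if not name.endswith(".py"):
            continue
        target = os.path.join(shared_dir, name)
        for vdir in version_dirs:
            os.makedirs(vdir, exist_ok=True)
            link = os.path.join(vdir, name)
            # Leave existing files or links alone.
            if not os.path.lexists(link):
                os.symlink(target, link)
                created.append(link)
    return created
```

Each interpreter then sees what looks like its own copy of the source
file, while only one copy is shipped in the package.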

>If you are symlinking .py files from a shared location, then that is not the
>same as "just install the package into a version-specific location". What
>motivation is there for preferring symlinks?

This reduces .py file duplications in distro packages.

>Who cares if a distro package install yields duplicate .py files? Nor am
>I motivated by having to carry duplicate .py files in a distribution
>package (I imagine the compression of duplicate .py files is amazing).

It might be amazing, but it's still a significant overhead.  As I've
described, multiply that by all the py files in all the distro packages
containing Python source code, and then still try to fit it on a CDROM.

>What happens to the distro packaging if a python package splits the
>codebase between 2.x and 3.x (meaning they have distinct .py files)?

The Debian/Ubuntu approach to Python 2/3 support is to provide them in
separate distro packages.  E.g. for Python package foo, you would have Debuntu
package python-foo (for the Python 2.x version) and python3-foo.  We do not
share source between Python 2 and 3 versions, at least not yet <wink>.  This
doesn't hurt us much because the number of Python packages that are source
compatible between the two is pretty low (Benjamin's 'six' package might
change that :), and not much depends on Python 3 yet.

>As someone else mentioned, how is virtualenv going to interact with packages
>that install like this?

This is a good question, but I *think* it won't affect it much at all.  To
test for sure I'd either need a Python 3 compatible virtualenv or backport my
patch to Python 2.6 and 2.7.  But still, I'm not sure it would matter since
the same shared library import suffix is used in either case.  I actually
think version-specific search paths would have a greater impact on
virtualenv.

-Barry

From barry at python.org  Wed Jun 30 20:55:16 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 14:55:16 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
	<AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
Message-ID: <20100630145516.08b5b2ec@heresy>

On Jun 25, 2010, at 11:58 AM, Brett Cannon wrote:

>> Placing .so files together does not simplify that install process in any
>> way. You will still have to handle such packages in a special way. You
>> must still compile the package multiple times for each relevant version
>> of python (with special tagging that I imagine distutils can take care
>> of) and, worse yet, you have created a trickier install than merely
>> having multiple search paths (e.g., installing/uninstalling lxml for
>> *one* version of python is actually more difficult in this scheme).
>
>This is meant to be used by distros in a programmatic fashion, so my
>response is "so what?" Their package management system is going to
>maintain the directory, not a person. You and I are not going to be
>using this for anything. This is purely meant for Linux OS vendors
>(maybe OS X) to manage their installs through their package software.
>I honestly do not expect human beings to be mucking around with these
>installs (and I suspect Barry doesn't either).

Spot on.

-Barry

From barry at python.org  Wed Jun 30 20:58:00 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 14:58:00 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C266702.4010102@ubuntu.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
	<AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
	<4C266702.4010102@ubuntu.com>
Message-ID: <20100630145800.7658936e@heresy>

On Jun 26, 2010, at 10:45 PM, Matthias Klose wrote:

>Having non-conflicting extension names is a schema which already is used on
>some platforms (debug builds on Windows).  The question for me is, if just a
>renaming of the .so files is acceptable for upstream, or if distributors
>should implement this on their own, as something like:
>
>   if ext_path.startswith('/usr/') and not ext_path.startswith('/usr/local/'):
>     load_ext('foo.2.6.so')
>   else:
>     load_ext('foo.so')
>
>I fear this will cause issues when e.g. virtualenv environments start copying
>parts from the system installation instead of symlinking it.

I concur.  I think my patch will have much less impact on virtualenv and
similar tools because there's nothing much magical about it.  It just says "oh
there's another file suffix you should consider when looking for a shared
library", which as you point out is already done on Windows.

-Barry

From barry at python.org  Wed Jun 30 21:03:28 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 15:03:28 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C2506AE.3060002@scottdial.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<D03C8909-2841-4B2E-A5A2-D1C1D65C2B7F@fuhm.net>
	<4C246E81.3020302@scottdial.com>
	<AANLkTikH4crFhnoaZ_U2NXYRxZp6o2_CZZwmwb5dT08n@mail.gmail.com>
	<4C2506AE.3060002@scottdial.com>
Message-ID: <20100630150328.281f5d5f@heresy>

On Jun 25, 2010, at 03:42 PM, Scott Dial wrote:

>On 6/25/2010 2:58 PM, Brett Cannon wrote:
>> I assume you are talking about PEP 3147. You're right that the PEP was
>> for pyc files and that's it. No one is talking about rewriting the
>> PEP.
>
>Yes, I am making reference to PEP 3147. I make reference to that PEP
>because this change is of the same order of magnitude as the .pyc
>change, and we asked for a PEP for that, and if this .so stuff is an
>extension of that thought process, then it should either be reflected by
>that PEP or a new PEP.

I think it's not nearly on the same order of magnitude as PEP 3147.  One way to
measure that is the size of the patch required to implement the feature and
ensure the test suite still works.  My versioned .so patch is *way* smaller.

I actually think because this is almost exclusively an extension to a
build-time configuration option, and doesn't really change the language, a PEP
shouldn't be necessary.  But by the same token, I'm willing to write a new one
(and *not* touch PEP 3147) just so that we have a point of reference to record
the discussion and decision.  So I'll do that.

-Barry

From barry at python.org  Wed Jun 30 21:06:10 2010
From: barry at python.org (Barry Warsaw)
Date: Wed, 30 Jun 2010 15:06:10 -0400
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <4C23DD99.9050604@egenix.com>
References: <20100624115048.4fd152e3@heresy>
	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>
	<20100624170944.7e68ad21@heresy> <4C23D3C2.1060500@scottdial.com>
	<4C23DD99.9050604@egenix.com>
Message-ID: <20100630150610.7ae4ac6a@heresy>

On Jun 25, 2010, at 12:35 AM, M.-A. Lemburg wrote:

>Scott Dial wrote:
>> On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>>>> What use case does this address?
>>>
>>>> If you want to make it so a system can install a package in just one
>>>> location to be used by multiple Python installations, then the version
>>>> number isn't enough.  You also need to distinguish debug builds, profiling
>>>> builds, Unicode width (see issue8654), and probably several other
>>>> ./configure options.
>>>
>>> This is a good point, but more easily addressed.  Let's say a distro makes
>>> three Python 3.2 variants available, one "normal" build, a debug build, and
>>> UCS2 and UCS4 versions of the above.  All we need to do is choose a different
>>> .so ABI tag (see previous follow-up) for each of those builds.  My updated patch
>>> (coming soon) allows you to define that tag to configure.  So e.g.
>> 
>> Why is this use case not already addressed by having independent
>> directories? And why is there an incentive to co-mingle these
>> version-punned files with version-agnostic ones?
>
>I don't think this is a good idea. After a while your Python
>lib directories would need some serious dusting off to make them
>maintainable again.
>
>Disk space is cheap so setting up dedicated directories for each
>variant will result in a much easier to manage installation.
>
>If you want a really clever setup, use hard links between those
>directories (you can also use symlinks if you like).
>Then a change in one Python file will automatically
>propagate to all other variant dirs without any maintenance
>effort. Together with PYTHONHOME this makes a really nice
>virtualenv-like environment.

Note that I do believe there is a difference between what users maintaining
their own Python installations might want, and what a distro needs to maintain
its entire Python stack.  So while dedicated directories might make more sense
if you're maintaining your own Python built from source, it doesn't make as
much sense for a distro, as described in previous responses by Matthias.

-Barry

From exarkun at twistedmatrix.com  Wed Jun 30 21:10:05 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 30 Jun 2010 19:10:05 -0000
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <4C2B7F00.2010602@v.loewis.de>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
	<20100630113232.1937.151974582.divmod.xquotient.556@localhost.localdomain>
	<4C2B7F00.2010602@v.loewis.de>
Message-ID: <20100630191005.1937.1474314461.divmod.xquotient.617@localhost.localdomain>

On 05:29 pm, martin at v.loewis.de wrote:
>Am 30.06.2010 13:32, schrieb exarkun at twistedmatrix.com:
>>On 05:24 am, martin at v.loewis.de wrote:
>>>>Seems to work fine.  So this I don't understand.  Any ideas, anyone?
>>>
>>>Didn't we discuss this before? The buildbot slave has no controlling
>>>terminal anymore, hence it cannot open /dev/tty. If you are curious,
>>>just patch your checkout to output the exact errno (e.g. to stdout),
>>>and trigger a build through the web.
>>
>>Could the test be rewritten (or supplemented) to use a pty?  Most or
>>perhaps all of the same operations should be supported.
>
>I'm not sure. It uses TIOCGPGRP, basically to establish that ioctl
>can also put results into a Python array (IIUC). This goes back to
>http://bugs.python.org/555817
>
>Somebody rewriting it would need to make sure the original test purpose
>is still met.

Absolutely.  And even so, it may still make sense to run the test 
against both /dev/tty and a pty (or whatever subset of those things can 
be acquired in the testing environment).

You can do a TIOCGPGRP on a new pty (created by os.openpty) but it 
produces somewhat less interesting results than doing it on /dev/tty. 
FIONREAD might be a nice alternative.  It produces interesting (ie, non- 
zero) values in an easily predictable/controllable way (it tells you how 
many bytes are in the read buffer).
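
A minimal sketch of what such a check could look like, keeping the
original point that ioctl() can write its result into a Python array.
The function names are hypothetical and this is not the existing test.

```python
import array
import fcntl
import os
import select
import termios


def bytes_waiting(fd):
    """Ask the kernel how many bytes can be read from fd right now."""
    buf = array.array("i", [0])
    # With a writable buffer, ioctl() stores the result in buf in place.
    fcntl.ioctl(fd, termios.FIONREAD, buf, True)
    return buf[0]


def pty_bytes_waiting(data, timeout=1.0):
    """Write data into a fresh pty and report what FIONREAD sees."""
    master, slave = os.openpty()
    try:
        os.write(slave, data)
        # Wait until the master side is readable before asking.
        select.select([master], [], [], timeout)
        return bytes_waiting(master)
    finally:
        os.close(slave)
        os.close(master)
```

Because the pty is created by the test itself, this does not depend on
the build slave having a controlling terminal.  The count reported for a
pty can be affected by the terminal's output processing (for example
newline translation), so a real test would need to pick its data with
that in mind.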

Jean-Paul

From exarkun at twistedmatrix.com  Wed Jun 30 21:11:22 2010
From: exarkun at twistedmatrix.com (exarkun at twistedmatrix.com)
Date: Wed, 30 Jun 2010 19:11:22 -0000
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <20100630184602.1937.1550858232.divmod.xquotient.569@localhost.localdomain>
References: <71728.1277866512@parc.com>
	<AANLkTimUuphnbKMCvrE7geY-3ABPYFMjz-11Cs_Vxst2@mail.gmail.com>
	<68796.1277913609@parc.com> <20100630184457.10067764@pitrou.net>
	<20100630184602.1937.1550858232.divmod.xquotient.569@localhost.localdomain>
Message-ID: <20100630191122.1937.493523511.divmod.xquotient.619@localhost.localdomain>

On 06:46 pm, exarkun at twistedmatrix.com wrote:
>
>On 04:44 pm, solipsis at pitrou.net wrote:
>>On Wed, 30 Jun 2010 09:00:09 PDT
>>Bill Janssen <janssen at parc.com> wrote:
>>>
>>>So, my question then is, why are these skips "unexpected"?  Seems to 
>>>me
>>>that if this is the case, this test will never run on any platform.
>>
>>You can change the value of the "usepty" option in your buildbot.tac.
>>(you will also have to restart the buildslave process)
>
>But don't do this.  The usepty option is completely unrelated to the 
>suggestion I was making.  Flipping it to True will only cause other 
>things to break and have no impact on this test.

Ah, sorry.  I confused myself.  The option is related.  But it will also 
break other things, so I still would recommend looking for other 
solutions.

Jean-Paul

From brett at python.org  Wed Jun 30 21:28:03 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 30 Jun 2010 12:28:03 -0700
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <68821.1277913795@parc.com>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de> 
	<68821.1277913795@parc.com>
Message-ID: <AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com>

On Wed, Jun 30, 2010 at 09:03, Bill Janssen <janssen at parc.com> wrote:
> Martin v. Löwis <martin at v.loewis.de> wrote:
>
>> > Seems to work fine.  So this I don't understand.  Any ideas, anyone?
>>
>> Didn't we discuss this before?
>
> Possibly, but I don't recall doing so.
>
>> The buildbot slave has no controlling
>> terminal anymore, hence it cannot open /dev/tty. If you are curious,
>> just patch your checkout to output the exact errno (e.g. to stdout),
>> and trigger a build through the web.
>
> So, why is skipping this test "unexpected"?  I see "x86 Tiger" is also
> showing this as an unexpected skip.  Should I just add it to the list of
> expected skips on Darwin?  Actually, will it run on any platform?

The whole "unexpected" skipping is somewhat of a mess. In an ideal
situation modules that are optionally built should be allowed to skip,
and on a per-platform basis certain OS-specific tests (whether they be
exclusive to a specific OS or run on all OSs except Windows) should be
skipped. Otherwise any import failure should be a test failure. The
"unexpected" test skipping was meant to solve both of these
situations, but in an imperfect way.

My PSF grant proposal to work on Python full-time for two to three
months after my Ph.D. is complete (assuming the PSF gives me the grant
this would start most likely in November or December) includes
cleaning up the test suite and this would be the first thing I tackle.

From martin at v.loewis.de  Wed Jun 30 21:53:08 2010
From: martin at v.loewis.de (=?UTF-8?B?Ik1hcnRpbiB2LiBMw7Z3aXMi?=)
Date: Wed, 30 Jun 2010 21:53:08 +0200
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
	<68821.1277913795@parc.com>
	<AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com>
Message-ID: <4C2BA0A4.9070002@v.loewis.de>

> The whole "unexpected" skipping is somewhat of a mess. In an ideal
> situation modules that are optionally built should be allowed to skip,

While this may be the wide-spread interpretation, it is definitely *not*
the original intention of the feature.

When Tim Peters added it, he wanted it to tell him whether he did the
Windows build correctly, INCLUDING ALL OPTIONAL PACKAGES that can
possibly work on Windows. If you try to generalize this beyond Windows,
then the only skips that are expected are the ones for tests that
absolutely cannot work on the platform - i.e. Unix tests on Windows,
and Windows tests on Unix. Otherwise, if you can get it to pass by
installing additional software, Tim did *not* mean this to be an
expected skip.

Regards,
Martin

From janssen at parc.com  Wed Jun 30 22:21:51 2010
From: janssen at parc.com (Bill Janssen)
Date: Wed, 30 Jun 2010 13:21:51 PDT
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <4C2BA0A4.9070002@v.loewis.de>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
	<68821.1277913795@parc.com>
	<AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com>
	<4C2BA0A4.9070002@v.loewis.de>
Message-ID: <76469.1277929311@parc.com>

Martin v. Löwis <martin at v.loewis.de> wrote:

> > The whole "unexpected" skipping is somewhat of a mess. In an ideal
> > situation modules that are optionally built should be allowed to skip,
> 
> While this may be the wide-spread interpretation, it is definitely *not*
> the original intention of the feature.
> 
> When Tim Peters added it, he wanted it to tell him whether he did the
> Windows build correctly, INCLUDING ALL OPTIONAL PACKAGES that can
> possibly work on Windows. If you try to generalize this beyond Windows,
> then the only skips that are expected are the ones for tests that
> absolutely cannot work on the platform - i.e. Unix tests on Windows,
> and Windows tests on Unix. Otherwise, if you can get it to pass by
> installing additional software, Tim did *not* mean this to be an
> expected skip.

Perfectly reasonable, good to know.  So on my OS X buildbots I should
update gdb, tcl/tk, and readline, so that those tests can run.

Probably be good to put a note in the regrtest.py comments to this
effect, as I don't see a PEP about testing or buildbots.

Bill

From mal at egenix.com  Wed Jun 30 22:35:56 2010
From: mal at egenix.com (M.-A. Lemburg)
Date: Wed, 30 Jun 2010 22:35:56 +0200
Subject: [Python-Dev] versioned .so files for Python 3.2
In-Reply-To: <20100630150610.7ae4ac6a@heresy>
References: <20100624115048.4fd152e3@heresy>	<AANLkTikRcPcFdKteuMmdMR5LI1T5QrmfGCQbHdf_lYB3@mail.gmail.com>	<20100624170944.7e68ad21@heresy>
	<4C23D3C2.1060500@scottdial.com>	<4C23DD99.9050604@egenix.com>
	<20100630150610.7ae4ac6a@heresy>
Message-ID: <4C2BAAAC.5090101@egenix.com>

Barry Warsaw wrote:
> On Jun 25, 2010, at 12:35 AM, M.-A. Lemburg wrote:
> 
>> Scott Dial wrote:
>>> On 6/24/2010 5:09 PM, Barry Warsaw wrote:
>>>>> What use case does this address?
>>>>
>>>>> If you want to make it so a system can install a package in just one
>>>>> location to be used by multiple Python installations, then the version
>>>>> number isn't enough.  You also need to distinguish debug builds, profiling
>>>>> builds, Unicode width (see issue8654), and probably several other
>>>>> ./configure options.
>>>>
>>>> This is a good point, but more easily addressed.  Let's say a distro makes
>>>> three Python 3.2 variants available, one "normal" build, a debug build, and
>>>> UCS2 and UCS4 versions of the above.  All we need to do is choose a different
>>>> .so ABI tag (see previous follow) for each of those builds.  My updated patch
>>>> (coming soon) allows you to define that tag to configure.  So e.g.
>>>
>>> Why is this use case not already addressed by having independent
>>> directories? And why is there an incentive to co-mingle these
>>> version-punned files with version-agnostic ones?
>>
>> I don't think this is a good idea. After a while your Python
>> lib directories would need some serious dusting off to make them
>> maintainable again.
>>
>> Disk space is cheap so setting up dedicated directories for each
>> variant will result in a much easier to manage installation.
>>
>> If you want a really clever setup, use hard links between those
>> directories (you can also use symlinks if you like).
>> Then a change in one Python file will automatically
>> propagate to all other variant dirs without any maintenance
>> effort. Together with PYTHONHOME this makes a really nice
>> virtualenv-like environment.
> 
> Note that I do believe there is a difference between what users maintaining
> their own Python installations might want, and what a distro needs to maintain
> its entire Python stack.  So while dedicated directories might make more sense
> if you're maintaining your own Python built from source, it doesn't make as
> much sense for a distro, as described in previous responses by Matthias.

Fair enough.

I haven't followed the thread closely, so Matthias will probably
already have answered this:

The Python default installation dir for
libs (including site-packages) is $prefix/lib/pythonX.X, so you
already have separate and properly versioned directory paths.

What difference would the extra version on the .so file make in
such a setup?
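
To make the question concrete, here is a small sketch of the two naming
schemes being compared. The helper names and the tag strings ("py32",
"py32-debug") are assumptions for illustration only; the actual tag
format from Barry's patch is not shown in this message.

```python
import posixpath


def versioned_lib_dir(prefix, major, minor):
    """Default library directory: $prefix/lib/pythonX.Y."""
    return posixpath.join(prefix, "lib", "python%d.%d" % (major, minor))


def tagged_extension_name(module, abi_tag):
    """Hypothetical tagged file name; the real tag format may differ."""
    return "%s.%s.so" % (module, abi_tag)


# The directory already separates 3.1 from 3.2 ...
print(versioned_lib_dir("/usr", 3, 1))
print(versioned_lib_dir("/usr", 3, 2))

# ... but a normal and a debug build of 3.2 map to the same directory,
# so only the file name could tell their extension modules apart.
print(tagged_extension_name("foo", "py32"))
print(tagged_extension_name("foo", "py32-debug"))
```

The directory name carries only the X.Y version, so any further
distinction would have to come from somewhere else, such as the file
name or a separate prefix.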

-- 
Marc-Andre Lemburg
eGenix.com


From brett at python.org  Wed Jun 30 23:12:59 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 30 Jun 2010 14:12:59 -0700
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <4C2BA0A4.9070002@v.loewis.de>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de> 
	<68821.1277913795@parc.com>
	<AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com> 
	<4C2BA0A4.9070002@v.loewis.de>
Message-ID: <AANLkTiml71IvQ0PnLO5zCB7ckVJN0kkAUtMk9ew8fJ4N@mail.gmail.com>

On Wed, Jun 30, 2010 at 12:53, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> The whole "unexpected" skipping is somewhat of a mess. In an ideal
>> situation modules that are optionally built should be allowed to skip,
>
> While this may be the wide-spread interpretation, it is definitely *not*
> the original intention of the feature.
>
> When Tim Peters added it, he wanted it to tell him whether he did the
> Windows build correctly, INCLUDING ALL OPTIONAL PACKAGES that can
> possibly work on Windows. If you try to generalize this beyond Windows,
> then the only skips that are expected are the ones for tests that
> absolutely cannot work on the platform - i.e. Unix tests on Windows,
> and Windows tests on Unix. Otherwise, if you can get it to pass by
> installing additional software, Tim did *not* mean this to be an
> expected skip.

Interesting. Do you use it that way when you make the Windows build?

From ncoghlan at gmail.com  Wed Jun 30 23:52:30 2010
From: ncoghlan at gmail.com (Nick Coghlan)
Date: Thu, 1 Jul 2010 07:52:30 +1000
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <4C2BA0A4.9070002@v.loewis.de>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de>
	<68821.1277913795@parc.com>
	<AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com>
	<4C2BA0A4.9070002@v.loewis.de>
Message-ID: <AANLkTikkElV9E3Ei7TXX7yxcl2YImv_oI-3eCRQXRxlO@mail.gmail.com>

On Thu, Jul 1, 2010 at 5:53 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
> When Tim Peters added it, he wanted it to tell him whether he did the
> Windows build correctly, INCLUDING ALL OPTIONAL PACKAGES that can
> possibly work on Windows. If you try to generalize this beyond Windows,
> then the only skips that are expected are the ones for tests that
> absolutely cannot work on the platform - i.e. Unix tests on Windows,
> and Windows tests on Unix. Otherwise, if you can get it to pass by
> installing additional software, Tim did *not* mean this to be an
> expected skip.

Note that it works this way on Linux as well. On Kubuntu (for example)
you need another half dozen or so additional *-dev packages installed
to avoid unexpected test skips.
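
A quick way to see which optional extension modules a given build is
missing is sketched below. It assumes a modern Python 3 (for
importlib.util.find_spec), and the module list is only an example, not
the complete set that the test suite checks.

```python
import importlib.util

# Example extension modules that are only built when the matching
# development headers are installed (illustrative, not exhaustive).
OPTIONAL_MODULES = ["readline", "_tkinter", "_sqlite3", "_bz2", "_ssl", "_curses"]


def missing_optional_modules(names):
    """Return the names that cannot be found by the import system."""
    return [name for name in names if importlib.util.find_spec(name) is None]


print(missing_optional_modules(OPTIONAL_MODULES))
```

Each name in the output points at a dependency that was not available
when Python was built, which is what shows up later as a test skip.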

Cheers,
Nick.

P.S. For anyone curious, I posted the list of extra packages you need
here: http://boredomandlaziness.blogspot.com/2010/01/kubuntu-dev-packages-to-build-python.html

-- 
Nick Coghlan   |   ncoghlan at gmail.com   |   Brisbane, Australia

From brett at python.org  Wed Jun 30 23:55:14 2010
From: brett at python.org (Brett Cannon)
Date: Wed, 30 Jun 2010 14:55:14 -0700
Subject: [Python-Dev] OS X buildbots: why am I skipping these tests?
In-Reply-To: <AANLkTikkElV9E3Ei7TXX7yxcl2YImv_oI-3eCRQXRxlO@mail.gmail.com>
References: <71728.1277866512@parc.com> <4C2AD511.5020709@v.loewis.de> 
	<68821.1277913795@parc.com>
	<AANLkTimMiG_N7ATdhNxxvKKwhUJH9iIUSclf9HDYz-vi@mail.gmail.com> 
	<4C2BA0A4.9070002@v.loewis.de>
	<AANLkTikkElV9E3Ei7TXX7yxcl2YImv_oI-3eCRQXRxlO@mail.gmail.com>
Message-ID: <AANLkTiknrNQMY1sfVw5Y1qQ_pkX4JyTEH70KmB6fFapj@mail.gmail.com>

On Wed, Jun 30, 2010 at 14:52, Nick Coghlan <ncoghlan at gmail.com> wrote:
> On Thu, Jul 1, 2010 at 5:53 AM, "Martin v. Löwis" <martin at v.loewis.de> wrote:
>> When Tim Peters added it, he wanted it to tell him whether he did the
>> Windows build correctly, INCLUDING ALL OPTIONAL PACKAGES that can
>> possibly work on Windows. If you try to generalize this beyond Windows,
>> then the only skips that are expected are the ones for tests that
>> absolutely cannot work on the platform - i.e. Unix tests on Windows,
>> and Windows tests on Unix. Otherwise, if you can get it to pass by
>> installing additional software, Tim did *not* mean this to be an
>> expected skip.
>
> Note that it works this way on Linux as well. On Kubuntu (for example)
> you need another half dozen or so additional *-dev packages installed
> to avoid unexpected test skips.

So it isn't that it's "unexpected", it's that a dependency is missing.
So it seems the terminology needs to get tweaked.