[Python-Dev] email package status in 3.X

P.J. Eby pje at telecommunity.com
Mon Jun 21 03:58:22 CEST 2010


At 08:08 AM 6/21/2010 +1000, Nick Coghlan wrote:
>Perhaps if people could identify which specific string methods are
>causing problems?

On a bytes object, __getitem__(int) returns an integer rather than a 
length-1 bytestring, so code that manipulates individual characters 
can't be handed bytes and still work.
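
To make that concrete, here is a minimal sketch of the difference. The two helper functions are hypothetical, made up for illustration; only the indexing and slicing behaviour is Python 3's own.

```python
text = 'abc'
data = b'abc'

first_text = text[0]     # indexing str gives a length-1 str
first_byte = data[0]     # indexing bytes gives an int
first_slice = data[0:1]  # slicing bytes gives a length-1 bytes object


def starts_with_sep_index(path, sep):
    # Hypothetical helper written with str in mind: it indexes one element.
    # For bytes, path[0] is an int, so the comparison with sep is never true.
    return path[0] == sep


def starts_with_sep_slice(path, sep):
    # Same check using a slice, which keeps the type of the input
    # and so behaves the same way for str and for bytes.
    return path[:1] == sep
```

The indexing version quietly gives the wrong answer for bytes rather than raising, which is arguably worse than an exception.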

That was one of the key differences I had in mind for a bstr type, 
apart from designing it to coerce normal strings to bstrs in 
cross-type operations, and to allow O(1) "conversion" to/from bytes.

Another randomly chosen byte/string incompatibility (Python 3.1; I 
don't have 3.2 handy at the moment):

 >>> os.path.join(b'x','y')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "c:\Python31\lib\ntpath.py", line 161, in join
     if b[:1] in seps:
TypeError: Type str doesn't support the buffer API

 >>> os.path.join('x',b'y')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
   File "c:\Python31\lib\ntpath.py", line 161, in join
     if b[:1] in seps:
TypeError: 'in <string>' requires string as left operand, not bytes

Ironically, it seems to me that in trying to make the type 
distinction more rigid, Py3K fails in this area precisely because it 
is not a rigidly typed language in the Java or Haskell sense: i.e., 
os.path.join doesn't say, "I need two stringlike objects of the *same 
type*", not even in its docstring.
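
As a rough sketch of what stating that requirement could look like, here is a hypothetical wrapper (join_same_type is my own name, not anything in the stdlib) that checks the argument types up front and says so plainly. It uses posixpath.join only so that the result is the same on every platform.

```python
import posixpath


def join_same_type(first, *rest):
    """Join path components that must all be str or all be bytes.

    Hypothetical wrapper for illustration; not part of os.path.
    """
    if not isinstance(first, (str, bytes)):
        raise TypeError("path components must be str or bytes, not %s"
                        % type(first).__name__)
    for part in rest:
        # Reject mixed input here, with a message that names the real problem,
        # instead of failing somewhere inside the join implementation.
        if type(part) is not type(first):
            raise TypeError("cannot mix %s and %s path components"
                            % (type(first).__name__, type(part).__name__))
    return posixpath.join(first, *rest)
```

With this, join_same_type(b'x', 'y') fails with an error about mixed types rather than one about the buffer API or the 'in' operator.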

At least in Java, you would either implement a "path" type with 
coercions from bytes and strings, or you'd have a class with 
overloaded methods for handling join operations on bytes and strings, 
respectively, thereby avoiding this whole mess.
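
For comparison, the first of those options, a path type with coercions, might look something like the following in Python. This is only a sketch under loud assumptions: CoercingPath is a made-up class, it always joins with '/', and it leans on os.fsdecode/os.fsencode (available from 3.2) to pick the encoding.

```python
import os


class CoercingPath:
    """Hypothetical path type that accepts str or bytes and stores str."""

    def __init__(self, value):
        # os.fsdecode returns str unchanged and decodes bytes using the
        # filesystem encoding, so both input types end up as text.
        self._text = os.fsdecode(value)

    def join(self, *parts):
        # Coerce every component the same way before joining.
        # Joining with '/' is a simplification for this sketch.
        pieces = [self._text] + [os.fsdecode(part) for part in parts]
        return CoercingPath('/'.join(pieces))

    def __str__(self):
        return self._text

    def __bytes__(self):
        return os.fsencode(self._text)
```

The caller can then mix input types freely, e.g. CoercingPath(b'x').join('y'), and ask for str or bytes at the end.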

(Alas, this little example on the 'in' operator also shows that my 
bstr effort would probably fail anyway, because there's no 
'__rcontains__' (__lcontains__?) to allow it to override the str 
type's __contains__.)
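
A small sketch of that asymmetry, using a made-up BStr class as a stand-in for the bstr idea: 'in' only ever consults the right-hand operand's __contains__, so a custom type can handle being the container but gets no say when it is the item.

```python
class BStr:
    """Hypothetical stand-in for a bstr type; wraps a bytes object."""

    def __init__(self, data):
        self.data = data

    def __contains__(self, item):
        # Used for `item in BStr(...)`: here we can coerce str to bytes.
        if isinstance(item, BStr):
            item = item.data
        elif isinstance(item, str):
            item = item.encode('ascii')
        return item in self.data

    def __rcontains__(self, container):
        # Not a real protocol method: Python never calls this.
        raise AssertionError("unreachable")


as_container = 'a' in BStr(b'abc')  # BStr.__contains__ runs

try:
    BStr(b'a') in 'abc'             # str.__contains__ runs, BStr is not asked
    as_item_error = None
except TypeError as exc:
    as_item_error = exc
```

The second case raises TypeError from str's side, and nothing on BStr is ever consulted.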


