[Python-3000] Questions about email bytes/str (python 3000)

Victor Stinner victor.stinner at haypocalc.com
Tue Aug 14 04:22:36 CEST 2007


Hi,

After many tests, I'm still unable to convert the email module to Python 3000.
I'm also unable to decide on the best type for some of its contents.



(1) Should email parts be stored as byte strings or character strings?

Related methods: Generator class, Message.get_payload(), Message.as_string().

Let's take an example: a multipart (MIME) email with a latin-1 section and a 
base64 (ASCII) section. Mixing latin-1 and ASCII means the message as a whole 
has no single charset; it is just a mix of bytes. So the best type should be 
bytes.

=> bytes
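
A minimal sketch of the example above, written against the bytes/str API of 
released Python 3 (which may differ from the Python 3000 branch at the time). 
The part contents and variable names are made up for illustration:

```python
import base64

# Hypothetical parts of one multipart message (contents are illustrative).
latin1_part = "caf\u00e9".encode("latin-1")        # contains the non-ASCII byte 0xE9
base64_part = base64.b64encode(b"\x00\x01binary")  # pure ASCII bytes

# Both parts are bytes, so they can be joined without picking a charset.
body = b"\r\n".join([latin1_part, base64_part])

# Each part is decoded on its own, according to its own headers.
text = latin1_part.decode("latin-1")
data = base64.b64decode(base64_part)

# Treating the whole body as ASCII text fails on the latin-1 part.
try:
    body.decode("ascii")
    whole_body_is_ascii = True
except UnicodeDecodeError:
    whole_body_is_ascii = False
```

The point is only that the container holds bytes; a charset applies per part, 
not to the message as a whole.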



(2) Parsing a file (raw string): should the parser use bytes or str?

The parser uses str-related methods like splitlines(), lower(), strip(). But 
it should be easy to rewrite or avoid these calls. I think that low-level 
parsing should be done on bytes. At the end, or once we know the charset, we 
can convert to str.

=> bytes
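
As a rough illustration of "parse on bytes, convert to str once the charset is 
known", here is a small sketch. It is not the email module's parser: 
split_headers() and charset_of() are hypothetical helpers, header folding and 
duplicate headers are ignored, and it relies on the bytes methods of released 
Python 3:

```python
def split_headers(raw):
    """Split a raw message into (headers, body), staying in bytes throughout."""
    head, _, body = raw.partition(b"\r\n\r\n")
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(b":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


def charset_of(headers, default="ascii"):
    """Return the charset parameter of Content-Type as str, or the default."""
    ctype = headers.get(b"content-type", b"")
    for param in ctype.split(b";")[1:]:
        key, _, val = param.partition(b"=")
        if key.strip().lower() == b"charset":
            # Charset names are ASCII, so this decode is safe.
            return val.strip().strip(b'"').decode("ascii")
    return default


raw = (b"Content-Type: text/plain; charset=iso-8859-1\r\n"
       b"Subject: test\r\n"
       b"\r\n"
       b"caf\xe9")

headers, body = split_headers(raw)
text = body.decode(charset_of(headers))  # str only at the very end
```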



About base64, I agree with Bill Janssen:
 - base64MIME.decode converts a string to bytes
 - base64MIME.encode converts bytes to a string

But decode could also accept bytes as input (as the base64 module does), 
converting it with str(value, 'ascii', 'ignore') or 
str(value, 'ascii', 'strict').
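
A sketch of that behaviour, assuming the standard base64 module of released 
Python 3 as the backend. The decode() and encode() functions below are 
stand-ins for illustration, not the real base64MIME functions:

```python
import base64


def decode(value):
    """Accept str or bytes; always return the decoded bytes."""
    if isinstance(value, bytes):
        # 'strict' raises UnicodeDecodeError on any non-ASCII byte.
        value = str(value, "ascii", "strict")
    return base64.b64decode(value)


def encode(data):
    """Take bytes; return the base64 form as str."""
    return base64.b64encode(data).decode("ascii")


from_str = decode("aGVsbG8=")
from_bytes = decode(b"aGVsbG8=")
encoded = encode(b"hello")

# With 'ignore', a stray non-ASCII byte is silently dropped instead.
lenient = str(b"aGVs\xffbG8=", "ascii", "ignore")
```

The choice between 'strict' and 'ignore' decides whether corrupted input is 
reported or silently skipped.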


I wrote 4 different (non-working) patches. So if you want to work on the email 
module for Python 3000, please contact me first. When I have a better 
patch, I will submit it.


Victor Stinner aka haypo
http://hachoir.org/

