From victor.stinner at haypocalc.com Thu Aug 9 02:41:08 2007
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Thu, 9 Aug 2007 02:41:08 +0200
Subject: [Email-SIG] fix email module for python 3000 (bytes/str)
Message-ID: <200708090241.08369.victor.stinner@haypocalc.com>

(This email was first sent to the python-3000 mailing list. Guido van Rossum suggested that I send it to email-sig, and that's what I'm doing :-))

Hi,

I started to work on the email module to port it to Python 3000, but I have trouble understanding whether a function should return bytes or str (because I don't know the email module).

Header.encode() -> bytes?
Message.as_string() -> bytes?
decode_header() -> list of (bytes, str|None) or (str, str|None)?
base64MIME.encode() -> bytes?
message_from_string() <- bytes?
Message.get_payload() -> bytes or str?

A charset name type is str, right?

---------------

Things to change to get bytes:
 - replace StringIO with BytesIO
 - add 'b' prefix, e.g. '' becomes b''
 - replace "%s=%s" % (x, y) with b''.join((x, b'=', y))
   => is it the best method to concatenate bytes?

Problems (porting Python 2.x code to 3000):
 - When obj.lower() is used, I expect obj to be str but it's bytes
 - obj.strip() doesn't work when obj is bytes: it requires an argument, but I don't know the right value! Maybe b'\n\r\v\t '?
 - iterating over a bytes object gives numbers and not bytes objects, e.g.
      for c in b"small text":
         if re.match("(\n|\r)", c): ...
   Is it possible to use a regex on bytes? re.compile(b"x") raises an exception

-- 
Victor Stinner aka haypo
http://hachoir.org/

From victor.stinner at haypocalc.com Sat Aug 11 01:49:10 2007
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Sat, 11 Aug 2007 01:49:10 +0200
Subject: [Email-SIG] fix email module for python 3000 (bytes/str)
In-Reply-To: <200708090241.08369.victor.stinner@haypocalc.com>
References: <200708090241.08369.victor.stinner@haypocalc.com>
Message-ID: <200708110149.10939.victor.stinner@haypocalc.com>

Hi,

On Thursday 09 August 2007 02:41:08 Victor Stinner wrote:
> I started to work on the email module to port it to Python 3000, but I have
> trouble understanding whether a function should return bytes or str (because I
> don't know the email module).

It's really hard to convert email module to Python 3000 because it does mix byte strings and (unicode) character strings...

I wrote some notes about bytes/str to help people migrate Python 2.x code to Python 3000, or at least to explain the difference between the Python 2.x "str" type and the Python 3000 "bytes" type:
   http://wiki.python.org/moin/BytesStr

About the email module, some deductions:
  test_email.py: openfile() must use 'rb' file mode for all tests
  base64MIME.decode() and base64MIME.encode() should accept bytes and str
  base64MIME.decode() result type is bytes
  base64MIME.encode() result type should be... bytes or str, no idea

Other decode() and encode() functions should use the same rules about types.

The Python modules binascii and base64 chose the bytes type for the encode result.
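The point about binascii and base64 can be checked with a short sketch. It assumes a released Python 3 interpreter (3.3 or later for the str-input case), which postdates this thread, so the 2007 py3k branch may have behaved differently.

```python
import base64
import binascii

raw = b"caf\xe9"  # arbitrary bytes, here latin-1 encoded text

# Both standard-library encoders return bytes, not str.
encoded = base64.b64encode(raw)
assert isinstance(encoded, bytes)
assert isinstance(binascii.b2a_base64(raw), bytes)

# Decoding accepts bytes, and also an ASCII-only str; the result is bytes.
assert base64.b64decode(encoded) == raw
assert base64.b64decode(encoded.decode("ascii")) == raw
```

Whether the email package's own base64MIME helpers should follow the same rule is exactly the open question in this thread.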
Victor Stinner aka haypo
http://hachoir.org/

From barry at python.org Sun Aug 12 16:50:05 2007
From: barry at python.org (Barry Warsaw)
Date: Sun, 12 Aug 2007 09:50:05 -0500
Subject: [Email-SIG] [Python-3000] fix email module for python 3000 (bytes/str)
In-Reply-To: <200708110149.10939.victor.stinner@haypocalc.com>
References: <200708090241.08369.victor.stinner@haypocalc.com>
    <200708110149.10939.victor.stinner@haypocalc.com>
Message-ID: <8B640CF2-EB88-45A5-A85F-1267AF24749E@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Aug 10, 2007, at 6:49 PM, Victor Stinner wrote:

> It's really hard to convert email module to Python 3000 because it
> does mix byte strings and (unicode) character strings...

Indeed, but I'm making progress. Just a very quick follow up now, with hopefully more detail soon. I'm cross posting this one on purpose because of a couple of more general py3k issues involved.

In r56957 I committed changes to sndhdr.py and imghdr.py so that they compare what they read out of the files against proper byte literals. AFAICT, neither module has a unittest, and if you run them from the command line, you'll see that they're completely broken (without my fix). The email package uses these to guess content type subparts for the MIMEAudio and MIMEImage subclasses. I didn't add unittests, just some judicious 'b' prefixes, and a quick command line test seems to make the situation better. This also makes a bunch of email unittests pass.

Another general Python thing that bit me was when an exception gets raised with a non-ascii message, e.g.

 >>> raise RuntimeError('oops')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
RuntimeError: oops
 >>> raise RuntimeError('oo\xfcps')
Traceback (most recent call last):
   File "<stdin>", line 1, in <module>
 >>>

Um, what? (I'm using an XEmacs shell buffer on OS X, but you get something similar in an iTerm and Terminal window.)
In the email unittests, I was getting one unexpected exception that had a non- ascii character in it, but this crashed the unittest harness because when it tried to print the exception message out, you'd instead get an exception in io.py and the test run would exit. Okay, that all makes sense, but IWBNI py3k could do better . Fixing other simple issues (not checked in yet), I'm down to 20 failures, 13 errors out of 247 tests. I'm running test_email_renamed.py only because test_email.py will go away (we should remove the old module names and bump the email pkg version number too). As for the other questions Victor raises, we definitely need to answer them, but that should be for another reply. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRr8eHXEjvBPtnXfVAQIrJgQAoWGaoN82/KFLggu0IIM0BSghIQppiFVv 9weB+Kq6oAcgN95XKGSCZmPwA8jHkeUAWRpm8gZn7k44N2fJuZw11Klajy0tzUPW Y4b5y8jPVU85phOKinynmHb9suXroyb35ZgMSp+WipL4L5PkOMv/x9q59Rs6ldjZ cQu3Sssai9I= =QG9j -----END PGP SIGNATURE----- From janssen at parc.com Sun Aug 12 19:09:26 2007 From: janssen at parc.com (Bill Janssen) Date: Sun, 12 Aug 2007 10:09:26 PDT Subject: [Email-SIG] [Python-3000] fix email module for python 3000 (bytes/str) In-Reply-To: <200708110149.10939.victor.stinner@haypocalc.com> References: <200708090241.08369.victor.stinner@haypocalc.com> <200708110149.10939.victor.stinner@haypocalc.com> Message-ID: <07Aug12.100928pdt."57996"@synergy1.parc.xerox.com> > base64MIME.decode() and base64MIME.encode() should accept bytes and str > base64MIME.decode() result type is bytes > base64MIME.encode() result type should be... bytes or str, no idea > > Other decode() and encode() functions should use same rules about types. Victor, Here's my take on this: base64MIME.decode converts string to bytes base64MIME.encode converts bytes to string Pretty straightforward. 
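The rule stated above can be sketched as a pair of thin wrappers. The names mime_b64_encode and mime_b64_decode are hypothetical (they are not the email package's API), and the sketch assumes a released Python 3, where the standard base64 module itself works in bytes.

```python
import base64


def mime_b64_encode(data: bytes) -> str:
    """Hypothetical helper: bytes in, ASCII text out."""
    if not isinstance(data, bytes):
        raise TypeError("encode expects bytes; encode text to a charset first")
    return base64.b64encode(data).decode("ascii")


def mime_b64_decode(text: str) -> bytes:
    """Hypothetical helper: ASCII text in, bytes out."""
    return base64.b64decode(text.encode("ascii"))


payload = "caf\xe9".encode("latin-1")   # the caller picks the charset explicitly
text = mime_b64_encode(payload)
assert isinstance(text, str)
assert mime_b64_decode(text) == payload  # the round trip is lossless
```

With this split the two functions are exact inverses of each other, and no charset ever has to be guessed inside the codec.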
Bill

From victor.stinner at haypocalc.com Mon Aug 13 02:26:03 2007
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Mon, 13 Aug 2007 02:26:03 +0200
Subject: [Email-SIG] [Python-3000] fix email module for python 3000 (bytes/str)
In-Reply-To: <8B640CF2-EB88-45A5-A85F-1267AF24749E@python.org>
References: <200708090241.08369.victor.stinner@haypocalc.com>
    <200708110149.10939.victor.stinner@haypocalc.com>
    <8B640CF2-EB88-45A5-A85F-1267AF24749E@python.org>
Message-ID: <200708130226.03670.victor.stinner@haypocalc.com>

On Sunday 12 August 2007 16:50:05 Barry Warsaw wrote:
> In r56957 I committed changes to sndhdr.py and imghdr.py so that they
> compare what they read out of the files against proper byte
> literals.

So nobody read my patches? :-( See my emails "[Python-3000] Fix imghdr module for bytes" and "[Python-3000] Fix sndhdr module for bytes" from last saturday. But well, my patches look similar.

Barry's patch is incomplete: test_voc() is wrong.

I attached a new patch:
 - fix "h[sbseek] == b'\1'" and "ratecode = ord(h[sbseek+4])" in test_voc()
 - avoid division by zero
 - use startswith method: replace h[:2] == b'BM' by h.startswith(b'BM')
 - use aifc.open() instead of old aifc.openfp()
 - use ord(b'P') instead of ord('P')

Victor Stinner aka haypo
http://hachoir.org/
-------------- next part --------------
A non-text attachment was scrubbed...
Name: py3k-imgsnd-hdr.patch
Type: text/x-diff
Size: 5326 bytes
Desc: not available
Url : http://mail.python.org/pipermail/email-sig/attachments/20070813/081c76b4/attachment.bin

From victor.stinner at haypocalc.com Tue Aug 14 04:22:36 2007
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Tue, 14 Aug 2007 04:22:36 +0200
Subject: [Email-SIG] Questions about email bytes/str (python 3000)
Message-ID: <200708140422.36818.victor.stinner@haypocalc.com>

Hi,

After many tests, I'm unable to convert email module to Python 3000. I'm also unable to take decision of the best type for some contents.
(1) Email parts should be stored as byte or character string?

Related methods: Generator class, Message.get_payload(), Message.as_string().

Let's take an example: multipart (MIME) email with latin-1 and base64 (ascii) sections. Mix latin-1 and ascii => mix bytes. So the best type should be bytes.

=> bytes

(2) Parsing file (raw string): use bytes or str in parsing?

The parser uses methods related to str like splitlines(), lower(), strip(). But it should be easy to rewrite/avoid these methods. I think that low-level parsing should be done on bytes. At the end, or when we know the charset, we can convert to str.

=> bytes

About base64, I agree with Bill Janssen:
 - base64MIME.decode converts string to bytes
 - base64MIME.encode converts bytes to string

But decode may accept bytes as input (as the base64 module does): use str(value, 'ascii', 'ignore') or str(value, 'ascii', 'strict').

I wrote 4 different (non-working) patches. So if you want to work on the email module and Python 3000, please contact me first. When I have a better patch, I will submit it.

Victor Stinner aka haypo
http://hachoir.org/

From barry at python.org Tue Aug 14 15:30:58 2007
From: barry at python.org (Barry Warsaw)
Date: Tue, 14 Aug 2007 09:30:58 -0400
Subject: [Email-SIG] [Python-3000] fix email module for python 3000 (bytes/str)
In-Reply-To: <200708130226.03670.victor.stinner@haypocalc.com>
References: <200708090241.08369.victor.stinner@haypocalc.com>
    <200708110149.10939.victor.stinner@haypocalc.com>
    <8B640CF2-EB88-45A5-A85F-1267AF24749E@python.org>
    <200708130226.03670.victor.stinner@haypocalc.com>
Message-ID:

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Aug 12, 2007, at 8:26 PM, Victor Stinner wrote:

> On Sunday 12 August 2007 16:50:05 Barry Warsaw wrote:
>> In r56957 I committed changes to sndhdr.py and imghdr.py so that they
>> compare what they read out of the files against proper byte
>> literals.
>
> So nobody read my patches?
:-( See my emails "[Python-3000] Fix > imghdr module > for bytes" and "[Python-3000] Fix sndhdr module for bytes" from last > saturday. But well, my patches look similar. Victor, sorry but my email was very spotty and I definitely missed your original patches. Sorry for duplicating work and thanks for fixing the last few things in these modules. Glad Guido got these committed. I'll follow up on email package more in a bit. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRsGuknEjvBPtnXfVAQLbfgQAqfiBeaVwIN35nXn9D7DZXItkzoZSd+1V f/a4PnzBHTdvFZgggisK/7o5b1uULOaHILLSmiQMFp0W/zV2JFCvKI7kc1/SkjSo UgIXK3o9WtmljH3aj1njc6fgy3VCVfa09NDKf89/rCy15AaSxF21YinIDIqF/yGN Sn2RQJqvNPc= =KpZC -----END PGP SIGNATURE----- From barry at python.org Tue Aug 14 17:39:29 2007 From: barry at python.org (Barry Warsaw) Date: Tue, 14 Aug 2007 11:39:29 -0400 Subject: [Email-SIG] [Python-3000] Questions about email bytes/str (python 3000) In-Reply-To: <200708140422.36818.victor.stinner@haypocalc.com> References: <200708140422.36818.victor.stinner@haypocalc.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Aug 13, 2007, at 10:22 PM, Victor Stinner wrote: > After many tests, I'm unable to convert email module to Python > 3000. I'm also > unable to take decision of the best type for some contents. I made a lot of progress on the email package while I was traveling, though I haven't checked things in yet. I probably will very soon, even if I haven't yet fixed the last few remaining problems. I'm down to 7 failures, 9 errors of 247 tests. > (1) Email parts should be stored as byte or character string? Strings. Email messages are conceptually strings so I think it makes sense to represent them internally as such. The FeedParser should expect strings and the Generator should output strings. 
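The string-in, string-out view described above can be sketched with the email package as it ships in released Python 3 versions, which postdate this thread. The message text and addresses are made up for illustration.

```python
import email

raw = (
    "From: alice@example.com\n"
    "To: bob@example.com\n"
    "Subject: hello\n"
    "\n"
    "Just a test.\n"
)

msg = email.message_from_string(raw)   # the parser is fed a str
assert msg["Subject"] == "hello"

flat = msg.as_string()                 # the generator hands back a str
assert isinstance(flat, str)
assert email.message_from_string(flat)["To"] == "bob@example.com"
```

This only shows the str-facing entry points; it does not settle which type the parser should use internally.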
One place where I think bytes should show up would be in decoded payloads, but in that case I really want to make an API change so that .get_payload(decode=True) is deprecated in favor of a separate method.

I'm proposing other API changes to make things work better, a few of which are in my current patch, but others I want to defer if they don't directly contribute to getting these tests to pass.

> Related methods: Generator class, Message.get_payload(),
> Message.as_string().
>
> Let's take an example: multipart (MIME) email with latin-1 and
> base64 (ascii) sections. Mix latin-1 and ascii => mix bytes. So the
> best type should be bytes.
>
> => bytes

Except that by the time they're parsed into an email message, they must be ascii, either encoded as base64 or quoted-printable. We also have to know at that point the charset being used, so I think it makes sense to keep everything as strings.

> (2) Parsing file (raw string): use bytes or str in parsing?
>
> The parser use methods related to str like splitlines(), lower(),
> strip(). But it should be easy to rewrite/avoid these methods. I
> think that low-level parsing should be done on bytes. At the end, or
> when we know the charset, we can convert to str.
>
> => bytes

Maybe, though I'm not totally convinced. It's certainly easier to get the tests to pass if we stick with parsing strings. email.message_from_string() should continue to accept strings, otherwise obviously it would have to be renamed, but also because its primary use case is turning a triple quoted string literal into an email message.

I alluded to the one crufty part of this in a separate thread. In order to accept universal newlines but preserve end-of-line characters, you currently have to open files in binary mode. Then, because my parser works on strings you have to convert those bytes to strings, which I am successfully doing now, but which I suspect is ultimately error prone. I would like to see a flag to preserve line endings on files opened in text + universal newlines mode, and then I think the hack for Parser.parse() would go away. We'd define how files passed to this method must be opened. Besides, I think it is much more common to be parsing strings into email messages anyway.

> About base64, I agree with Bill Janssen:
> - base64MIME.decode converts string to bytes
> - base64MIME.encode converts bytes to string

I agree.

> But decode may accept bytes as input (as base64 modules does): use
> str(value, 'ascii', 'ignore') or str(value, 'ascii', 'strict').

Hmm, I'm not sure about this, but I think that .encode() may have to accept strings.

> I wrote 4 differents (non-working) patches. So I you want to work
> on email module and Python 3000, please first contact me. When I
> will get a better patch, I will submit it.

Like I said, I also have an extensive patch that gets me most of the way there. I don't want to have dueling patches, so I think what I'll do is put a branch in the sandbox and apply my changes there for now. Then we will have real code to discuss.

A few other things from my notes and diff:

Do we need email.message_from_bytes() and Message.as_bytes()? While I'm (currently) pretty well convinced that email messages should be strings, the use case for bytes includes reading them directly to or from sockets, though in this case because the RFCs generally require ascii with encodings and charsets clearly described, I think a bytes-to-string wrapper may suffice.

Charset class: How do we do conversions from input charset to output charset? This is required by e.g. Japanese to go from euc-jp to iso-2022-jp IIUC. Currently I have to use a crufty string-to-bytes converter like so:

 >>> bytes(ord(c) for c in s)

rather than just bytes(s). I'm sure there's a better way I haven't found yet.

Generator._write_headers() and the _is8bitstring() test aren't really appropriate or correct now that everything is unicode.
This affected quite a few tests because long headers that previously were getting split were now not getting split. I ended up ditching the _is8bitstring() test, but that led me into an API change for Message.__str__() and Message.as_string(), which I've long wanted to do anyway. First, Message.__str__() no longer includes the Unix-From header, but more importantly, .as_string() takes the maxheaderlen as an argument and defaults to no header wrapping. By changing various related tests to call .as_string(maxheaderlen=78), these split header tests can be made to pass again. I think these changes make str(some_message) saner and more explicit (because it does not split headers) but these may be controversial in the email-sig.

You asked earlier about decode_header(). This should definitely return a list of tuples of (bytes, charset|None).

Header is going to need some significant revision. First, there's the whole mess of .encode() vs. __str__() vs. __unicode__() to sort out. It's insane that the latter two had different semantics w.r.t. whitespace preservation between encoded words, so let's fix that.

Also, if the common use case is to do something like this:

 >>> msg['subject'] = 'a subject string'

then I wonder if we shouldn't be doing more sanity checking on the header value. For example, if the value had a non-ascii character in it, then what should we do? One way would be to throw an exception, requiring the use of something like:

 >>> msg['subject'] = Header('a \xfc subject', 'utf-8')

or we could do the most obvious thing and try to convert to 'ascii' then 'utf-8' if no charset is given explicitly.

I thought about always turning headers into Header instances, but I think that might break some common use cases. It might be possible to define equality and other operations on Header instances so that these common cases continue to work. The email-sig can address that later.
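The explicit-charset form and the (bytes, charset) result for decode_header() can be sketched with the email.header module from a released Python 3. That module postdates this thread, so the 2007 branch under discussion may not have behaved this way.

```python
from email.header import Header, decode_header, make_header

# Non-ASCII header text wrapped in a Header with an explicit charset.
subject = Header("a \xfc subject", "utf-8")

# encode() produces an RFC 2047 encoded-word, which is pure ASCII text.
wire = subject.encode()
wire.encode("ascii")  # would raise if any non-ASCII character remained

# For a header made of encoded-words, decode_header() returns
# (bytes, charset) pairs.
parts = decode_header(wire)
for chunk, charset in parts:
    assert isinstance(chunk, bytes)
    assert charset == "utf-8"

# make_header() rebuilds a Header; str() gives the decoded text back.
assert str(make_header(parts)) == "a \xfc subject"
```

Note that for a header containing no encoded-words at all, released versions return the original str with a charset of None rather than bytes.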
However, if all Header instances are unicode and have a valid charset, I wonder if the splittable tests are still relevant, and whether we can simplify header splitting. I have to think about this some more. As for the remaining failures and errors, they come down to simplifying the splittable logic, dealing with Message.__str__() vs. Message.__unicode__(), verifying that the UnicodeErrors some tests expect to get raise don't make sense any more, and fixing a couple of other small issues I haven't gotten to yet. I will create a sandbox branch and apply my changes later today so we have something concrete to look at. Cheers, - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRsHMsXEjvBPtnXfVAQLfCwP8CeHi9RBW5ULri3w6sBz5a1fkdVCftk71 uW8q0LercTJSa2ewvtrlWdKm9F403IabYjh2Bg8cZfHmYyZ+/b18oU64zzkZylo/ pHw9Iyvk9ZW6G7mwJRwpV9c6JXJNvsQtKRWipuue0ZMagI5OJBXR8vhRIDGkt+NC ARhIrHXPEW8= =DBLp -----END PGP SIGNATURE----- From janssen at parc.com Wed Aug 15 03:44:54 2007 From: janssen at parc.com (Bill Janssen) Date: Tue, 14 Aug 2007 18:44:54 PDT Subject: [Email-SIG] [Python-3000] Questions about email bytes/str (python 3000) In-Reply-To: References: <200708140422.36818.victor.stinner@haypocalc.com> Message-ID: <07Aug14.184454pdt."57996"@synergy1.parc.xerox.com> > > Let's take an example: multipart (MIME) email with latin-1 and > > base64 (ascii) > > sections. Mix latin-1 and ascii => mix bytes. So the best type > > should be > > bytes. > > > > => bytes > > Except that by the time they're parsed into an email message, they > must be ascii, either encoded as base64 or quoted-printable. We also > have to know at that point the charset being used, so I think it > makes sense to keep everything as strings. Actually, Victor's right here -- it makes more sense to treat them as bytes. It's RFC 821 (SMTP) that requires 7-bit ASCII, not the MIME format. Non-SMTP mail transports do exist, and are popular in various places. 
Email transported via other transport mechanisms may, for instance, use a Content-Transfer-Encoding of "binary" for some sections of the message. Some parts of the top-most header of the message may be counted on to be encoded as ASCII strings, but not the whole message in general. > > About base64, I agree with Bill Janssen: > > - base64MIME.decode converts string to bytes > > - base64MIME.encode converts bytes to string > > I agree. > > > But decode may accept bytes as input (as base64 modules does): use > > str(value, 'ascii', 'ignore') or str(value, 'ascii', 'strict'). > > Hmm, I'm not sure about this, but I think that .encode() may have to > accept strings. Personally, I think it would avoid more errors if it didn't. Let the user explicitly encode the string to a particular representation before calling base64.encode(). Bill From barry at python.org Wed Aug 15 07:50:30 2007 From: barry at python.org (Barry Warsaw) Date: Wed, 15 Aug 2007 01:50:30 -0400 Subject: [Email-SIG] [Python-3000] Questions about email bytes/str (python 3000) In-Reply-To: References: <200708140422.36818.victor.stinner@haypocalc.com> Message-ID: -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Aug 14, 2007, at 11:39 AM, Barry Warsaw wrote: > I will create a sandbox branch and apply my changes later today so > we have something concrete to look at. Done. See: http://svn.python.org/view/sandbox/trunk/emailpkg/5_0-exp/ I'm down to 5 failures and 6 errors (in test_email.py only), and I think most if not all of them are related to the broken header splittable stuff. Please take a look. 
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRsKUJnEjvBPtnXfVAQISBQQAnEKytL8fqLbe+HADIyIBr1gDFtzbc4nw
zY4oEDPV+d4zFiAj9Ap5uePCfQxnqRdBMsHhkbCkB9k0XSDoWv2NxC10KLdE2CEO
YMLB+BB5uMjTCkHhaUVr/rIdKv/4LKZFy1v9dJv5X3BF5clugWa3L+tioe0kPk9X
jDkjZKc59LE=
=73uN
-----END PGP SIGNATURE-----

From victor.stinner at haypocalc.com Wed Aug 15 21:52:38 2007
From: victor.stinner at haypocalc.com (Victor Stinner)
Date: Wed, 15 Aug 2007 21:52:38 +0200
Subject: [Email-SIG] [Python-3000] Questions about email bytes/str (python 3000)
In-Reply-To: <07Aug14.184454pdt."57996"@synergy1.parc.xerox.com>
References: <200708140422.36818.victor.stinner@haypocalc.com>
    <07Aug14.184454pdt."57996"@synergy1.parc.xerox.com>
Message-ID: <200708152152.38839.victor.stinner@haypocalc.com>

On Wednesday 15 August 2007 03:44:54 Bill Janssen wrote:
> > (...) I think that base64MIME.encode() may have to accept strings.
>
> Personally, I think it would avoid more errors if it didn't.

Yeah, how can you guess which charset the user wants to use? For most users, there is only one charset: latin-1. So if you use UTF-8, they will not understand the conversion errors.

Another argument: I like bidirectional codecs:
   decode(encode(x)) == x
   encode(decode(x)) == x

So if you mix bytes and str, these relations will be wrong.

Victor Stinner aka haypo
http://hachoir.org/

From barry at python.org Sun Aug 19 22:19:09 2007
From: barry at python.org (Barry Warsaw)
Date: Sun, 19 Aug 2007 16:19:09 -0400
Subject: [Email-SIG] The performance issue of the email package, and how about a cEmail?
In-Reply-To: <20070818173801.787E.ICEBERG@21cn.com>
References: <20070818173801.787E.ICEBERG@21cn.com>
Message-ID: <257CD72B-8935-4239-AF9C-8A3A137910EE@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Aug 18, 2007, at 6:11 AM, Iceberg wrote:

> It seems the only way to boost up the performance is to rewrite the
> key part (mostly FeedParser.py) in C. And I believe the effort will
> be worthy for such a fundamental, widespread, core package. In
> fact, we saw similar happen, such as: cString, cPickle, cProfile.
>
> I read old mail archive (http://mail.python.org/pipermail/email-
> sig/) since 2005, but found no thread on this topic. So, I would
> venture to ask, is there any plan for a cEmail package in near future?

There are no plans by me to do this, but if you're interested, I think it could be a worthy goal.

Without looking at those existing packages, there are two things I'd say. I doubt that either package would be included in Python by default, either because it's C++ or because of a license incompatibility. OTOH, it may or may not be worth enabling optional building of a cFeedParser based on whether these packages are available or not.

OTOH, it might be nice to provide something like a cFeedParser as a third-party egg, and if it works out, and is enough of a performance boost, I'd probably support extending the email package to use it if it's available.

Cheers,
- -Barry

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (Darwin)

iQCVAwUBRsilvnEjvBPtnXfVAQJDAwP9HBWew9kpf8IS8DM90/cnoWr8udblxcGi
W6YgLcG8JR9B22aUThC8t/5wMuu3mBZhgouyPgNCUK/j4kL1zMC33zYoinGzzrke
F4f9ZXQ8Z1eG+GreDGhjxD6psrcpDAj+/04XtyL1tr7FE5GWcEN90f9InhzFGbQF
Uu3PPLIZ9n0=
=NPq5
-----END PGP SIGNATURE-----

From barry at python.org Wed Aug 22 00:12:40 2007
From: barry at python.org (Barry Warsaw)
Date: Tue, 21 Aug 2007 18:12:40 -0400
Subject: [Email-SIG] [Python-3000] Py3k Sprint Tasks (Google Docs & Spreadsheets)
In-Reply-To:
References:
Message-ID: <93DBB66F-5D0D-4E46-8480-D2BFC693722A@python.org>

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Aug 21, 2007, at 1:56 PM, gvanrossum at gmail.com wrote:

> I've shared a document with you called "Py3k Sprint Tasks":
> http://spreadsheets.google.com/ccc?
> key=pBLWM8elhFAmKbrhhh0ApQA&inv=python-3000 at python.org&t=3328567089265 > 242420&guest > > It's not an attachment -- it's stored online at Google Docs & > Spreadsheets. To open this document, just click the link above. > > (resend, I'm not sure this made it out the first time) > > This spreadsheet is where I'm organizing the tasks for the Google > Sprint starting tomorrow. > > Feel free to add. If you're coming to the sprint, feel free to claim > ownership of a task. I have approval to spend some official time at this sprint, though I'll be working from home and will be on IRC, Skype, etc. I've been spending hours of my own time on the email package for py3k this week and every time I think I'm nearing success I get defeated again. I think Victor Stinner came to similar conclusions. To put it mildly, the email package is effed up! But I'm determined to solve the worst of the problems this week. I only have Wednesday and Thursday to work on this, with most of my time available on Thursday. I'd really like to find one or two other folks to connect with to help work out the stickiest issues. Please contact me directly or on this list to arrange a time with me. I'm UTC-4 if that helps. I'll be on #python-dev (barry) too. Remember that the current code is in the python sandbox (under emailpkg/5_0-exp). I have some uncommitted code which I'll try to check in tonight, though I don't know if it will make matters better or worse. ;) Cheers, - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRstjWXEjvBPtnXfVAQLQcQP+Lo/D1YH1+w/51kNyQN1+zrzu1Cov7ERk 1xtT5L2LlaPjXGeVMlc6Xz0bbLVc96kSQ4SIrkc5RRNorcYzMf8kID4rLkO6S+kU CXtpOVgmzkX9zotAL9O72v2uOHT6c0fcK8ag44EiAtWei3Tdf+R2rL6lOzo0lHgj qmVPFzlzGCA= =t1nr -----END PGP SIGNATURE----- From stephen at xemacs.org Sat Aug 25 08:10:04 2007 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Sat, 25 Aug 2007 15:10:04 +0900 Subject: [Email-SIG] [Python-3000] Py3k Sprint Tasks (Google Docs & Spreadsheets) In-Reply-To: <93DBB66F-5D0D-4E46-8480-D2BFC693722A@python.org> References: <93DBB66F-5D0D-4E46-8480-D2BFC693722A@python.org> Message-ID: <87y7g0401v.fsf@uwakimon.sk.tsukuba.ac.jp> Barry Warsaw writes: > I've been spending hours of my own time on the email package for py3k > this week and every time I think I'm nearing success I get defeated > again. I'm ankle deep in the Big Muddy (daughter tested positive for TB as expected -- the Japanese innoculate all children against it because of the sins of their fathers -- and school starts on Tuesday, so we need to make a bunch of extra trips to doctors and whatnot), so what thin hope I had of hanging out with the big boys at the Python-3000 sprint long since evaporated. However, starting next week I should have a day a week or so I can devote to email stuff -- if you want to send any thoughts or requisitions my way (or an URL to sprint IRC transcripts), I'd love to help. Of course you'll get it all done and leave none for me, right? > But I'm determined to solve the worst of the problems this week. Bu-wha-ha-ha! Steve From barry at python.org Sun Aug 26 20:30:47 2007 From: barry at python.org (Barry Warsaw) Date: Sun, 26 Aug 2007 14:30:47 -0400 Subject: [Email-SIG] [Python-3000] Py3k Sprint Tasks (Google Docs & Spreadsheets) In-Reply-To: <87y7g0401v.fsf@uwakimon.sk.tsukuba.ac.jp> References: <93DBB66F-5D0D-4E46-8480-D2BFC693722A@python.org> <87y7g0401v.fsf@uwakimon.sk.tsukuba.ac.jp> Message-ID: <9CBCCF2F-B428-4D37-8C18-1EAFB86CD7D9@python.org> -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 On Aug 25, 2007, at 2:10 AM, Stephen J. Turnbull wrote: > Barry Warsaw writes: > >> I've been spending hours of my own time on the email package for py3k >> this week and every time I think I'm nearing success I get defeated >> again. 
> > I'm ankle deep in the Big Muddy (daughter tested positive for TB as > expected -- the Japanese innoculate all children against it because of > the sins of their fathers -- and school starts on Tuesday, so we need > to make a bunch of extra trips to doctors and whatnot), so what thin > hope I had of hanging out with the big boys at the Python-3000 sprint > long since evaporated. Stephen, sorry to hear about your daughter and I hope she's going to be okay of course! > However, starting next week I should have a day a week or so I can > devote to email stuff -- if you want to send any thoughts or > requisitions my way (or an URL to sprint IRC transcripts), I'd love to > help. Of course you'll get it all done and leave none for me, right? Unfortunately, we didn't really sprint much on it, but I did get a chance to spend time on the branch. I think I see the light at the end of the tunnel for getting the existing tests to pass, though I haven't even looked at test_email_codecs.py yet. Because of the way things are going to work with in put and output codecs, I'll definitely want to get some sanity checks with Asian codecs. I'll try to put together a list of issues and questions and get those sent out next week. >> But I'm determined to solve the worst of the problems this week. > > Bu-wha-ha-ha! Heh, well I'm getting closer. We're definitely going to have some API changes, so I'll outline those as well. - -Barry -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.7 (Darwin) iQCVAwUBRtHG13EjvBPtnXfVAQKCngP+PUTm82FjnVpqz7HvPLS/zPXBMelDNhkK AKGIk5hveka180QEbA/DMsu7LZmPK2jXOQJWxufRsLfuzwKL3WtDF1IIyiICkC/I HoR04bHZJzUdEzZuZPL53I704JoO8QBpXEOn/JdauFEaZ6qakueLdnqx1Ab0LbSP RCLiVh9BxtU= =6Ngh -----END PGP SIGNATURE----- From stephen at xemacs.org Tue Aug 28 05:36:56 2007 From: stephen at xemacs.org (Stephen J. 
Turnbull) Date: Tue, 28 Aug 2007 12:36:56 +0900 Subject: [Email-SIG] [Python-3000] Py3k Sprint Tasks (Google Docs & Spreadsheets) In-Reply-To: <9CBCCF2F-B428-4D37-8C18-1EAFB86CD7D9@python.org> References: <93DBB66F-5D0D-4E46-8480-D2BFC693722A@python.org> <87y7g0401v.fsf@uwakimon.sk.tsukuba.ac.jp> <9CBCCF2F-B428-4D37-8C18-1EAFB86CD7D9@python.org> Message-ID: <87tzqkz5wn.fsf@uwakimon.sk.tsukuba.ac.jp> Barry Warsaw writes: > Stephen, sorry to hear about your daughter and I hope she's going to > be okay of course! Oh, she's *fine*. There's just a conflict between the Japanese practice of vaccinating all school children against TB, and the U.S. practice of testing for TB antibodies. About 1 in 3 kids coming from Japan to U.S. schools get snagged. Annoying, but I'll trade this for the problems with visas and the like that colleagues have had *any* day. > haven't even looked at test_email_codecs.py yet. Because of the way > things are going to work with in put and output codecs, I'll > definitely want to get some sanity checks with Asian codecs. OK, *that* I can help with!