[Distutils] pkginfo python 3 port

Tres Seaver tseaver at palladion.com
Fri May 28 19:54:35 CEST 2010


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Barry Warsaw wrote:
> On May 27, 2010, at 08:59 PM, Tres Seaver wrote:
> 
>> Barry Warsaw wrote:
>>> On May 27, 2010, at 10:25 AM, Sridhar Ratnakumar wrote:
>>>
>>>> Is there a way to parse a RFC 822 message in Python 3?
>>> If it's ASCII, you should have no problems using email.parser.Parser.
>> The issue is that its behavior is subtly different from the now-removed
>> rfc822 parser::
>>
>>  $ /opt/Python-2.6.5/bin/python
>>  Python 2.6.5 (r265:79063, Apr  6 2010, 14:45:18)
>>  [GCC 4.3.3] on linux2
>>  Type "help", "copyright", "credits" or "license" for more information.
>>  >>> from StringIO import StringIO
>>  >>> with_multiline = StringIO("""\
>>  ... Description:  this is a multiline RFC 822
>>  ...               header.""")
>>  >>> from rfc822 import Message
>>  >>> rfc_msg = Message(with_multiline)
>>  >>> with_multiline.seek(0)
>>  >>> from email.parser import Parser
>>  >>> email_msg = Parser().parse(with_multiline)
>>  >>> rfc_msg.getheader('Description')
>>  'this is a multiline RFC 822\n header.'
>>  >>> email_msg.get('Description')
>>  'this is a multiline RFC 822\n              header.'
> 
> If I'm reading this correctly, the "problem" is that rfc822 collapses
> continuation whitespace and email.parser preserves it?  Isn't the email
> package (more) correct,

"More correct" is debatable:  The email.parser module does not remove
the newline, for instance, which is what RFC 2822 suggests for
"unfolding" header lines:

 http://www.faqs.org/rfcs/rfc2822.html

Collapsing extra leading whitespace in header continuation lines seems
like a reasonable strategy:  lines created by "folding" per RFC (2)822
won't normally have any, while lines which do have it (e.g., as created
by distutils, or perhaps by hand) carry no meaning in that whitespace.
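
To make the two behaviours concrete, here is a minimal sketch.  The
helper names "unfold" and "collapse" are made up for illustration and
are not part of any library:  "unfold" drops each line break that is
followed by whitespace, as the RFC describes, while "collapse" mimics
what the old rfc822 module returned.

```python
import re


def unfold(value):
    """Drop each line break that is immediately followed by whitespace.

    The whitespace itself is kept, which is how RFC 2822 describes
    unfolding.  (Hypothetical helper, for illustration only.)
    """
    return re.sub(r'\r?\n(?=[ \t])', '', value)


def collapse(value):
    """Reduce each continuation's leading whitespace to a single space.

    The newline is kept, matching what rfc822.Message.getheader
    returned in the session quoted above.  (Hypothetical helper.)
    """
    return re.sub(r'\r?\n[ \t]+', '\n ', value)


raw = 'this is a multiline RFC 822\n     header.'
print(repr(unfold(raw)))
print(repr(collapse(raw)))
```

Neither function touches a value which has no continuation lines.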

> and what specific problem does that cause?  Or is it
> just that it's different so tools have to catch up to that?

In particular, pkginfo wants to run across a wide range of Python
versions, with Python 2.4 still actively supported.  I therefore need to
fall back to the rfc822 module when the newer module is not present.  I
have chosen for the moment to enforce the collapsing where email.parser
is used.
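
Roughly, the approach looks like the sketch below.  This is not the
actual pkginfo code:  "parse_headers" and "get_collapsed" are names
invented here, and the sketch assumes that importing email.parser
fails on Python 2.4, so that rfc822 is used there instead.

```python
import re

try:
    # Assumed to be available on newer Pythons, including Python 3.
    from email.parser import Parser
except ImportError:
    # Assumed fallback for old interpreters such as Python 2.4.
    Parser = None
    from rfc822 import Message

# A line break followed by the leading whitespace of a continuation line.
_CONTINUATION = re.compile(r'\r?\n[ \t]+')


def parse_headers(fp):
    """Parse the headers from a file-like object with whichever parser exists."""
    if Parser is not None:
        return Parser().parse(fp, headersonly=True)
    return Message(fp)


def get_collapsed(msg, name):
    """Return a header value with continuation whitespace collapsed.

    Each continuation's leading whitespace becomes a single space, so
    both parsers yield the same string.  Returns None if the header
    is absent.
    """
    value = msg.get(name)
    if value is None:
        return None
    return _CONTINUATION.sub('\n ', value)
```

Values coming from rfc822 are already collapsed, so running them
through "get_collapsed" leaves them unchanged.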


Tres.
- --
===================================================================
Tres Seaver          +1 540-429-0999          tseaver at palladion.com
Palladion Software   "Excellence by Design"    http://palladion.com
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iEYEARECAAYFAkwAA1YACgkQ+gerLs4ltQ6ZRwCeOuwr3bq/h6BJqWWNfUB+qygB
/fQAoI5daS3qA/yYiEF6s+PsDaGPcxOn
=KN6V
-----END PGP SIGNATURE-----


