[ python-Bugs-531205 ] Bugs in rfc822.parseaddr()

SourceForge.net noreply at sourceforge.net
Thu Nov 25 02:29:41 CET 2004


Bugs item #531205, was opened at 2002-03-18 02:13
Message generated for change (Comment added) made by facundobatista
You can respond by visiting: 
https://sourceforge.net/tracker/?func=detail&atid=105470&aid=531205&group_id=5470

Category: Python Library
>Group: Python 2.4
Status: Open
Resolution: None
Priority: 5
Submitted By: Barry A. Warsaw (bwarsaw)
Assigned to: Ben Gertzfield (che_fox)
Summary: Bugs in rfc822.parseaddr()

Initial Comment:
This bug is in rfc822.parseaddr(), and thus inherited
into email.Utils.parseaddr() since the latter does a
straight include of the former.  It has a nasty bug
when the email address contains embedded spaces: it
collapses the spaces:

>>> from email.Utils import parseaddr
>>> parseaddr('foo bar@wooz.org')
('', 'foobar@wooz.org')
>>> parseaddr('<foo bar@wooz.org>')
('', 'foobar@wooz.org')

Boo, hiss.  Of course parseaddr() would be more
involved to implement in an RFC 2822 compliant way, but
it would be very cool.

Note that I'm reporting this bug here instead of the
mimelib project because it's actually in rfc822.py. 
One solution might include fixing it in the email
package only.




----------------------------------------------------------------------

>Comment By: Facundo Batista (facundobatista)
Date: 2004-11-24 22:29

Message:
Logged In: YES 
user_id=752496

Reassigning it as a Py2.4 bug.

----------------------------------------------------------------------

Comment By: Paul Moore (pmoore)
Date: 2004-11-08 17:49

Message:
Logged In: YES 
user_id=113328

This issue still exists in Python 2.3.4 and Python 2.4b2.

----------------------------------------------------------------------

Comment By: Johannes Gijsbers (jlgijsbers)
Date: 2004-07-22 15:30

Message:
Logged In: YES 
user_id=469548

Well, the docs say "unless the parse fails, in which case a
2-tuple of ('', '') is returned". I think it's reasonable to
say that non-compliant addresses like this should fail to
parse and thus that parseaddr('foo bar@wooz.org') should
return ('', '').

----------------------------------------------------------------------

Comment By: Tim Roberts (timroberts)
Date: 2002-08-12 18:40

Message:
Logged In: YES 
user_id=265762

Interesting to note that RFC 822 (but not 2822) allows spaces 
around any periods in the address without quoting (2822 does 
allow spaces around the @), and those spaces are to be 
removed.  Section A.1.4 gives the example 
   Wilt  .  Chamberlain@NBA.US
and says it should be parsed as "Wilt.Chamberlain".

Given that, it's hard for me to see that the current behavior 
should be changed at all, since there is no correct way to 
parse this non-compliant address.
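
To make the RFC 822 rule mentioned above concrete, here is a small sketch
of removing whitespace around periods. collapse_period_whitespace() is a
hypothetical helper, not part of rfc822.py, and it assumes the input has
no quoted strings or comments (those would need real tokenising).

```python
import re


def collapse_period_whitespace(addr_spec):
    """Remove whitespace on either side of each '.' in an unquoted addr-spec."""
    return re.sub(r'\s*\.\s*', '.', addr_spec)
```

Applied to the example, 'Wilt  .  Chamberlain@NBA.US' becomes
'Wilt.Chamberlain@NBA.US'. Whitespace that is not next to a period, as in
'foo bar@wooz.org', is left alone, which is exactly the case the RFC gives
no rule for.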

----------------------------------------------------------------------

Comment By: Barry A. Warsaw (bwarsaw)
Date: 2002-04-15 14:18

Message:
Logged In: YES 
user_id=12800

Note further that "foo bar"@wooz.org is properly parsed. 
The question is, what should parseaddr() do in this
non-compliant situation?  I can think of a couple of things:

- it could raise an exception
- it could return ('', 'bar@wooz.org')
- it could return ('foo', 'bar@wooz.org')
- it could return ('', '"foo bar"@wooz.org')

I'm not sure what the right thing to do is.  I'm assigning
to Ben Gertzfield to get his opinion.  Ben, feel free to add
a comment and re-assign the bug to me.
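
For comparison, the three non-raising options listed above can be sketched
side by side. This is only an illustration: split_local_part() and
candidate_results() are hypothetical names, nothing here is taken from
rfc822.py, and the input is assumed to be a bare addr-spec with no angle
brackets or comments.

```python
def split_local_part(addr_spec):
    """Split 'local@domain' at the last '@'; raise ValueError if absent."""
    local, sep, domain = addr_spec.rpartition('@')
    if not sep:
        raise ValueError('no @ in %r' % (addr_spec,))
    return local, domain


def candidate_results(addr_spec):
    """Return the possible (realname, address) outcomes, keyed by label."""
    local, domain = split_local_part(addr_spec)
    words = local.split()
    if len(words) < 2:
        # No embedded whitespace: nothing ambiguous to resolve.
        return {'unchanged': ('', addr_spec)}
    last = words[-1] + '@' + domain
    return {
        # Keep only the word next to the '@'.
        'drop leading words': ('', last),
        # Treat the earlier words as a display name.
        'leading words as name': (' '.join(words[:-1]), last),
        # Keep the whole local part, but quote it.
        'quote local part': ('', '"%s"@%s' % (local, domain)),
    }
```

For 'foo bar@wooz.org' this yields ('', 'bar@wooz.org'),
('foo', 'bar@wooz.org') and ('', '"foo bar"@wooz.org') respectively. The
remaining option, raising an exception, would replace the dictionary with
a raise in the multi-word branch.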

----------------------------------------------------------------------
