[Spambayes] illegal header tanks SpamBayes

Tim Peters tim.one at comcast.net
Sat Jul 12 01:45:57 EDT 2003


OK, I see what's going wrong in Bobby's email.  Unfortunately, Barry is on
vacation for the next week, and I'm not sure what to do about it.

The problem occurs at the transition point into the second section of the
multipart/alternative; here are the repr's of the relevant lines:

    '--= Multipart Boundary 2076846-- \r\n'
    '\r\n'
    '\n'
    'Congratulations!  You have been selected to receive a \r\n'

The relevant part of the main headers is here:

    'Content-Type: multipart/alternative;\r\n'
    '        boundary="= Multipart Boundary 2076846" \r\n'

The entire HTML part of the message is treated as preamble, because that
part is missing a boundary marker.

The text part of the msg also seems ill-formed, because of the trailing
'-- ' after the boundary tag.  The msg as a whole is ill-formed for another
reason:  it's missing a trailing boundary tag too.
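
For reference, RFC 2046 spells a part delimiter as '--' plus the boundary
value, and the closing delimiter as that plus a trailing '--'; trailing
whitespace on the line is tolerated.  A rough sketch of classifying the lines
around the transition follows.  The helper name classify() is made up for
illustration and is not part of the email package:

```python
BOUNDARY = '= Multipart Boundary 2076846'

def classify(line, boundary):
    """Return 'open', 'close', or None for a single message line.

    Hypothetical helper, for illustration only.
    """
    # Drop the line ending and any trailing whitespace padding.
    stripped = line.rstrip('\r\n \t')
    if stripped == '--' + boundary:
        return 'open'
    if stripped == '--' + boundary + '--':
        return 'close'
    return None

# Lines around the transition, as they appear in the offending message.
lines = [
    '</html>\r\n',
    '\r\n',
    '--= Multipart Boundary 2076846-- \r\n',
    '\r\n',
    '\n',
    'Congratulations!  You have been selected to receive a \r\n',
]

found = []
for i, line in enumerate(lines):
    kind = classify(line, BOUNDARY)
    if kind is not None:
        found.append((i, kind))

print(found)
```

The only boundary line classifies as a closing delimiter, and nothing
classifies as an opening one, which is consistent with the parser seeing no
well-formed start for either part.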

I can't figure out what this code in _parsebody() thinks it's doing, and
it's the cause of the ultimate exception:

            # Find out what kind of line endings we're using
            start += len(mo.group('sep')) + len(mo.group('ws'))
            mo = NLCRE.search(payload, start)
            if mo:
                start += len(mo.group(0))

At this point, start is between the second '6' and the third '-' in

    '--= Multipart Boundary 2076846-- \r\n'
                                   ^

so

            mo = NLCRE.search(payload, start)

finds the \r\n at the end of

    '--= Multipart Boundary 2076846-- \r\n'
                                      ^

and then

                start += len(mo.group(0))

sets start just before the blank at the third-last character of

    '--= Multipart Boundary 2076846-- \r\n'
                                     ^

This logic simply makes no sense to me.  The *effect* is to skip over the
last two dashes, treating the MIME section headers as starting with the

    ' \r\n'

from the tail end of the boundary line, and the leading blank there is what
raises the ultimate

    'Continuation line seen before first header'

exception.

If the *intent* of

                start += len(mo.group(0))

is to move start just beyond the end of the line with the boundary, then a
correct way to spell that is

                start = mo.end(0)

(and ditto for the earlier

            start += len(mo.group('sep')) + len(mo.group('ws'))

in this function).  But I don't know the intent, and the comment (above)
doesn't seem to match the code either.  If the indicated code is changed to

                start = mo.end(0)

then this msg parses without error (in non-strict mode).
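
To see the two spellings side by side outside the parser, here is a minimal
sketch.  It assumes NLCRE matches '\r\n', '\r' or '\n' (a stand-in written for
this example, not copied from the email package), and it places start just
past the boundary text, as described above:

```python
import re

# Assumed stand-in for the parser's NLCRE; the real pattern may differ.
NLCRE = re.compile(r'\r\n|\r|\n')

boundary = '--= Multipart Boundary 2076846'
payload = boundary + '-- \r\n\r\n\nCongratulations!'

# start sits between the second '6' and the third '-'.
start = len(boundary)

mo = NLCRE.search(payload, start)

# Original spelling: advances start by the *length* of the line ending,
# no matter how far away the line ending was found.
buggy_start = start + len(mo.group(0))

# Proposed spelling: the absolute offset just past the matched line ending.
fixed_start = mo.end(0)

print(repr(payload[buggy_start:buggy_start + 5]))
print(repr(payload[fixed_start:fixed_start + 5]))
```

With the original spelling the remaining text begins with ' \r\n', so the
first "header" line starts with a blank; with mo.end(0) it begins on the line
after the boundary.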
-------------- next part --------------
x= 'X-MS-Mail-Gibberish: Microsoft Mail Internet Headers Version 2.0\r\nReceived: from memex2.harrahs.org ([10.3.5.44]) by entcmail1.harrahs.org with Microsoft SMTPSVC(5.0.2195.5329);\r\n\t Fri, 11 Jul 2003 11:59:52 -0500\r\nReceived: from mail6.aswediscussed.com ([64.253.204.211]) by memex2.harrahs.org with Microsoft SMTPSVC(5.0.2195.5329);\r\n\t Fri, 11 Jul 2003 11:59:54 -0500\r\nReceived: from aswediscussed.com (192.168.1.8)\r\n  by mail6.aswediscussed.com with SMTP; 11 Jul 2003 12:59:53 -0400\r\nFrom: As We Discussed<584-2076846-unsubscribe at aswediscussed.com>\r\nTo: hwilkins at harrahs.com \r\nSubject: Get an Unsecured Platinum Card. No Credit Checks\r\nDate: Fri, 11 Jul 2003 10:56:27 -0400\r\nContent-Type: multipart/alternative;\r\n        boundary="= Multipart Boundary 2076846" \r\nX-Priority: 3 \r\nX-MSMail-Priority: Normal \r\nX-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)  \r\nX-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1106  \r\nMime-Version: 1.0 \r\nReturn-Path: 584-2076846-unsubscibe at aswediscussed.com\r\nMessage-ID: <MEMEX2K5RVTZEnWiytX0000a463 at memex2.harrahs.org>\r\nX-OriginalArrivalTime: 11 Jul 2003 16:59:54.0239 (UTC) FILETIME=[D9AD58F0:01C347CD]\r\n\n<HTML>\r\n<HEAD>\r\n<TITLE></TITLE>\r\n\r\n</HEAD>\r\n<BODY BGCOLOR=#FFFFFF>\r\n<table width=575 border=0 cellpadding=0 cellspacing=0 align="center">\r\n  <tr> \r\n    <td colspan=3> <a href="http://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20="><img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat_email7x_01.gif" width=575 height=91 alt="" border="0"></a></td>\r\n  </tr>\r\n  <tr> \r\n    <td colspan=3> <a href="http://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20="><img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat_email7x_02.gif" width=575 height=197 alt="" border="0"></a></td>\r\n  </tr>\r\n  <tr> \r\n    <td colspan=3> <a 
href="http://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20="><img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat7_text_03.gif" width=575 height=104 alt="" border="0"></a></td>\r\n  </tr>\r\n  <tr> \r\n    <td rowspan=2> <a href="http://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20="><img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat_email7x_04.gif" width=286 height=99 alt="" border="0"></a></td>\r\n    <td> <img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat_email7_arrow.gif" width=115 height=47 alt=""></td>\r\n    <td rowspan=2> <a href="http://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20="><img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat_email7x_06.gif" width=174 height=99 alt="" border="0"></a></td>\r\n  </tr>\r\n  <tr> \r\n    <td> <a href="http://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20="><img src="http://freeimagehosting.sendmebargains.com/platinum/usaplat_email7x_07.gif" width=115 height=52 alt="" border="0"></a></td>\r\n  </tr>\r\n</table>\r\n\r\n\r\n\r\n<img src="http://www.aswediscussed.com/o/o584o.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20=" width="1" height="1" alt="" border="0">\r\n&nbsp;<br>&nbsp;<br><center><table bgcolor="#c0c0c0" width=575 border=0 cellpadding="1" cellspacing="2"><tr><td bgcolor="#efefef" align=center><font face=verdana size=1 color=black>To stop receiving offers <a href=\'http://www.aswediscussed.com/unsubscribe/?ea=hwilkins@harrahs.com\'><font face=verdana size=1 color=black>Go Here</font></a><font face=verdana size=1 color=black> or send mail to:<br>Unsubscribe Department<br>1730 S. Federal Hwy, Suite 116<br>Delray Beach, FL 33483</font></td></tr></table></center>\r\n</body>\r\n</html>\r\n\r\n--= Multipart Boundary 2076846-- \r\n\r\n\nCongratulations!  You have been selected to receive a \r\n$7500 unsecured Platinum Credit Card \r\nfrom USA Platinum! 
\r\n\r\nYour approval is guaranteed*.  \r\nSimply click on the link below to complete \r\nthe application\r\n\r\nhttp://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20=\r\n\r\nThis offer is valid even if you\'ve had past credit problems or \r\neven no credit history.  Now you \r\ncan receive a $7,500 unsecured \r\nPlatinum Credit Card that can help build your credit.  And to help \r\nget your card to you sooner, we have been authorized to waive any \r\nemployment or credit verification.\r\n\r\nhttp://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20=\r\n\r\nThat\'s right, now you can enjoy great merchandise while establishing \r\nyour credit because the \r\nUSA Platinum Credit Card reports your new \r\ncredit to the major credit bureaus. We can help you \r\nestablish your \r\ncredit while you purchase the items you want to have today! \r\n\r\nhttp://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20=\r\n\r\nYour approval is guaranteed*! Act now and claim your unsecured USA \r\nPlatinum Credit Card with a \r\nstarting $7500 credit limit today. \r\n\r\nhttp://www.aswediscussed.com/c/c584c.php?ea=aHdpbGtpbnNAaGFycmFocy5jb20=\r\n\r\nSincerely,\r\n\r\nYour New Offers Department\r\n\r\n(*see web site for qualifications)\r\n\r\n\r\nStop receiving offers:\r\nhttp://www.aswediscussed.com/unsubscribe/?ea=hwilkins@harrahs.com\r\n\r\n'

#for y in x.splitlines(True):
#    print repr(y)

import email
msg = email.message_from_string(x)

