Possible bug in Email and Multifile Modules?

Sheila King usenet at thinkspot.net
Mon Feb 4 04:01:43 EST 2002


Some friends and I are working on a script that filters incoming email. We
ran across some difficulties today with a certain test message that a
friend is using to test the script.

The original script is using mimetools and multifile to parse MIME
multipart messages (some parts of it were written when we only had access
to 1.5.2 on our web host).

The pertinent parts of the script are as follows (note--I didn't write this
code...written by a "python newbie" friend of mine):

============(begin code)==================
import sys, os
import mimetools, multifile

header = mimetools.Message(sys.stdin,0)
msgtype = header.gettype()

if msgtype[:10] == "multipart/":
	file = multifile.MultiFile(sys.stdin,0)
	file.push(header.getparam("boundary"))
	while file.next():
		submsg = mimetools.Message(file)
		nm = submsg.getparam("name")
		if nm <> None:
			nm = nm.lower()

			if nm in bannedfiles:
				if append_status == "on":
					msg_bannedfiles = msg_bannedfiles + msg_break + \
						"(Reason: " + quote + nm + quote + " file not permitted.)"
				print msg_bannedfiles
				sys.exit(exitcode)

			if bannedextensions:
				tempvar = nm.split(".")
				lastindex = len(tempvar) - 1
				ext = tempvar[lastindex]
				if ext in bannedextensions:
					if append_status == "on":
						msg_bannedextensions = msg_bannedextensions + msg_break + \
							"(Reason: " + quote + ext + quote + " files not permitted.)"
					print msg_bannedextensions
					sys.exit(exitcode)

		fnm = submsg.getparam("filename")
		if fnm <> None:
			fnm = fnm.lower()
			if fnm in bannedfiles:
				if append_status == "on":
					# report fnm here: nm may be None in this branch
					msg_bannedfiles = msg_bannedfiles + msg_break + \
						"(Reason: " + quote + fnm + quote + " file not permitted.)"
				print msg_bannedfiles
				sys.exit(exitcode)

			tempvar = fnm.split(".")
			lastindex = len(tempvar) - 1
			ext = tempvar[lastindex]
			if ext in bannedextensions:
				if append_status == "on":
					msg_bannedextensions = msg_bannedextensions + msg_break + \
						"(Reason: " + quote + ext + quote + " files not permitted.)"
				print msg_bannedextensions
				sys.exit(exitcode)
	file.pop()
============(end code)==================
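
The "name" and "filename" branches above apply the same two tests, so the
logic can be condensed into one function. The following is only a sketch and
not the script's actual code: rejection_reason, banned_files and
banned_extensions are hypothetical names, the double quotes stand in for the
script's quote setting (whose value is not shown), and it is written for a
current Python 3 rather than 1.5.2 or 2.2.

```python
def rejection_reason(name, banned_files, banned_extensions):
    """Return a reason string if the attachment name is banned, else None."""
    if name is None:
        return None
    name = name.lower()
    if name in banned_files:
        return '(Reason: "%s" file not permitted.)' % name
    # The extension is whatever follows the last dot, as in the script above.
    ext = name.split(".")[-1]
    if ext in banned_extensions:
        return '(Reason: "%s" files not permitted.)' % ext
    return None


# Hypothetical settings, for illustration only.
print(rejection_reason("stuff.MP3.pif", ["readme.exe"], ["pif", "scr"]))
print(rejection_reason("notes.txt", ["readme.exe"], ["pif", "scr"]))
```

Calling it once for the "name" parameter and once for "filename" would cover
both branches without repeating the checks.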

Someone tested the code today with a message containing a known virus and
reported the following error:

> btw, I think I found one of the bugs in Python2.2. The script bombs 
> out when I feed it an email that contains the Sircam virus. The error is;
> multifile.Error: sudden EOF in MultiFile.readline()
> When I switch back to Python1.5.2, the problem doesn't exist. 

On the chance that I might fare better with the new email module, I got a
copy of the email message from him and tried the following in an
interactive session, with results as shown:

=============(interpreter session)====================
>>> msg = email.message_from_file(open('virus.txt'))
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
  File "E:\PYTHON\PYTHON22\lib\email\__init__.py", line 35, in
message_from_file
    return _Parser(_class).parse(fp)
  File "E:\PYTHON\PYTHON22\lib\email\Parser.py", line 40, in parse
    self._parsebody(root, fp)
  File "E:\PYTHON\PYTHON22\lib\email\Parser.py", line 116, in _parsebody
    raise Errors.BoundaryError(
BoundaryError: Couldn't find terminating boundary:
====_ABC1234567890DEF_====
=============(end interpreter session)================

Now, it turns out that this message is not well formed. Instead of the
correct closing boundary at the end of the message, it has a boundary which
indicates that another part should follow.
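
To make the distinction concrete: under the MIME multipart rules (RFC 2046),
a line of two hyphens followed by the boundary string separates parts, and
the same line with two more hyphens appended closes the multipart. The sketch
below is illustrative only. classify_boundary_line and has_close_delimiter
are hypothetical helper names, the sample lines are made up, and the code
assumes a current Python 3.

```python
def classify_boundary_line(line, boundary):
    """Return 'close', 'delimiter', or None for one line of a multipart body."""
    stripped = line.rstrip("\r\n \t")  # ignore trailing whitespace and newline
    if stripped == "--" + boundary + "--":
        return "close"        # ends the multipart
    if stripped == "--" + boundary:
        return "delimiter"    # announces that another part follows
    return None


def has_close_delimiter(lines, boundary):
    """True if any line is the closing delimiter for this boundary."""
    return any(classify_boundary_line(l, boundary) == "close" for l in lines)


boundary = "====_ABC1234567890DEF_===="
well_formed = ["--" + boundary + "\n", "part one\n", "--" + boundary + "--\n"]
malformed = ["--" + boundary + "\n", "part one\n", "--" + boundary + "\n"]

print(has_close_delimiter(well_formed, boundary))
print(has_close_delimiter(malformed, boundary))
```

The second list mirrors the situation described above: its last line is an
ordinary delimiter, so a strict parser keeps waiting for a part that never
arrives.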

I can see no way to read this message using the email module, which means
I could not use the module to scan incoming mail.
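
One possible workaround is to split the body by hand and treat end-of-file as
an implicit closing boundary, then hand each part to the parser separately.
The following is a minimal sketch under stated assumptions, not a tested
fix: split_parts_leniently is a hypothetical helper, the sample body is an
invented stand-in (not the actual virus message), and the code assumes a
current Python 3, where email.message_from_string and Message.get_filename
are available.

```python
import email


def split_parts_leniently(body, boundary):
    """Split a multipart body on its boundary, treating EOF as the close.

    The preamble (text before the first delimiter) and the epilogue (text
    after a proper close delimiter) are dropped.
    """
    delimiter = "--" + boundary
    parts = []
    current = None  # None until the first delimiter has been seen
    for line in body.splitlines(True):
        stripped = line.rstrip()
        if stripped in (delimiter, delimiter + "--"):
            if current:
                parts.append("".join(current))
            if stripped == delimiter + "--":
                return parts  # proper close: ignore the epilogue
            current = []
        elif current is not None:
            current.append(line)
    # Reached EOF without a close delimiter: keep whatever was collected.
    if current:
        parts.append("".join(current))
    return parts


boundary = "====_ABC1234567890DEF_===="
# Invented sample: the final line lacks the trailing "--".
raw_body = "\n".join([
    "--" + boundary,
    "Content-Type: text/plain",
    "",
    "Hi! How are you?",
    "--" + boundary,
    'Content-Type: audio/x-mpeg; name="stuff.MP3.pif"',
    'Content-Disposition: attachment; filename="stuff.MP3.pif"',
    "",
    "(attachment data omitted)",
    "--" + boundary,
    "",
])

names = []
for part_text in split_parts_leniently(raw_body, boundary):
    part = email.message_from_string(part_text)  # each part is non-multipart
    filename = part.get_filename()
    if filename is not None:
        names.append(filename.lower())
print(names)
```

This keeps the attachment names available for the banned-file checks even
when the terminating boundary is missing. It does not handle nested
multiparts.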

I reported my findings to this other tester:

>  You will see that the final boundary is missing the two closing
>  hyphens. Therefore, it looks like there should be another message
>  part, but there isn't one.
> 
> Is it possible that Norton removed the last part or something? Are you
> 100% sure that this is the intact message as you received it? If so,
> then I guess it isn't a correctly formed message.
> 
> Still, this seems to me to be a problem with the modules in Python. I
> will report it in the newsgroup and see what they say.
> 
> I was able to import the message you sent into Agent and Agent did
> show it as having two attachments, the second one being named
> stuff.MP3.pif.

He responded:

>  No parts were removed, it's 100% intact. And you're right, it's not
>  well-formed, but that's the whole idea behind it. It tricks Outlook
>  into executing the attachment.
> But not being well-formed shouldn't break anything...
> 


Curious to hear what others, more experienced than I, think about this. I
can provide a copy of the virus message and the complete code for the
current script upon request.

-- 
Sheila King
http://www.thinkspot.net/sheila/

"When introducing your puppy to an adult cat,
restrain the puppy, not the cat." -- Gwen Bailey,
_The Perfect Puppy: How to Raise a Well-behaved Dog_





