parsing email from stdin

Antoon Pardon antoon.pardon at rece.vub.ac.be
Tue Oct 8 08:20:20 EDT 2013


I want to do some postprocessing on messages from a particular mailbox.
So I use getmail, which fetches the messages and feeds them to the stdin
of my program.

As I don't know what encoding these messages will be in, I thought it
would be prudent to read stdin as binary data.

Using Python 3.3 on a Debian box, I have the following code:

#!/usr/bin/python3

import sys
from email import message_from_file

sys.stdin = sys.stdin.detach()
msg = message_from_file(sys.stdin)

which gives me the following traceback:

   File "/home/apardon/.getmail/verdeler", line 7, in <module>
     msg = message_from_file(sys.stdin)
   File "/usr/lib/python3.3/email/__init__.py", line 56, in message_from_file
     return Parser(*args, **kws).parse(fp)
   File "/usr/lib/python3.3/email/parser.py", line 58, in parse
     feedparser.feed(data)
   File "/usr/lib/python3.3/email/feedparser.py", line 167, in feed
     self._input.push(data)
   File "/usr/lib/python3.3/email/feedparser.py", line 100, in push
     data, self._partial = self._partial + data, ''
TypeError: Can't convert 'bytes' object to str implicitly

which seems rather odd. The following headers are in the message:

Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit

So why doesn't the email parser look up the charset and use that
for converting to string type?
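
A likely explanation, offered as a sketch rather than a definitive answer:
message_from_file expects a text-mode file, so the str/bytes concatenation
in the traceback fails before any header has been examined. When a message
is parsed from bytes instead, the declared charset can be applied to the
body afterwards. The sample message and the helper name body_text below are
made up for illustration; they are not part of the original script.

```python
from email import message_from_bytes

# Hypothetical sample message carrying the same two headers as above.
# The body contains one non-ASCII byte (0xE9, "é" in ISO-8859-1).
RAW = (
    b"Content-Type: text/plain; charset=ISO-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)


def body_text(msg):
    """Return the body of a non-multipart message as str.

    get_payload(decode=True) undoes the transfer encoding and returns
    bytes; the charset declared in Content-Type is then used to decode.
    """
    raw_body = msg.get_payload(decode=True)
    charset = msg.get_content_charset() or "ascii"  # fallback is an assumption
    return raw_body.decode(charset, errors="replace")


msg = message_from_bytes(RAW)
text = body_text(msg)
```

Here the parser keeps the raw bytes intact, and the charset lookup happens
only when the body is requested.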

What is the canonical way to parse an email message from stdin?
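
One approach that should work on Python 3.2 and later is to keep stdin
binary and hand it to email.message_from_binary_file, the bytes counterpart
of message_from_file. The following is a minimal sketch under that
assumption. The helper name read_message and the in-memory stand-in for
stdin are illustrative only; in a script invoked by getmail the stream
would be sys.stdin.buffer.

```python
import io
import sys
from email import message_from_binary_file


def read_message(stream=None):
    """Parse one message from a binary stream.

    With no argument, read from the byte buffer underlying sys.stdin,
    so no text decoding happens before the parser sees the data.
    """
    if stream is None:
        stream = sys.stdin.buffer
    return message_from_binary_file(stream)


# Stand-in for the bytes getmail would pipe to the program (assumed sample).
fake_stdin = io.BytesIO(
    b"Subject: test\n"
    b"Content-Type: text/plain; charset=ISO-8859-1\n"
    b"Content-Transfer-Encoding: 8bit\n"
    b"\n"
    b"caf\xe9\n"
)

msg = read_message(fake_stdin)
subject = msg["Subject"]
```

The stream returned by sys.stdin.detach() is also binary, so it could be
passed to read_message in the same way.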

-- 
Antoon Pardon



