The lib email parse problem...
John Machin
sjmachin at lexicon.net
Tue Aug 29 06:02:46 EDT 2006
叮叮当当 wrote:
> this is not enough.
>
> when a part is multipart/alternative, i must find out which sub part i
> need, not all the subparts. so i must know when the alternative is
> ended.
>
So you'll have to write your own tree-walker. It would seem that
is_multipart(), get_content_type() and get_payload() are the important
methods.
Here's a quickly lashed-up example:
def choose_one(part, html_ok=False):
    # Return the last acceptable subpart of a multipart/alternative part,
    # or None if no subpart is acceptable.
    last = None
    for subpart in part.get_payload():
        if html_ok or "html" not in subpart.get_content_type():
            last = subpart
    return last

def traverse(part, html_ok=False):
    mp = part.is_multipart()
    ty = part.get_content_type()
    print("multi:%r type:%r file:%r" % (mp, ty,
        part.get_filename("<<NoFileName>>")))
    if mp:
        if ty == "multipart/alternative":
            # Descend into only one of the alternatives.
            chosen = choose_one(part, html_ok=html_ok)
            if chosen is not None:
                traverse(chosen, html_ok=html_ok)
        else:
            for subpart in part.get_payload():
                traverse(subpart, html_ok=html_ok)

import email
# msg_text must hold the raw message source as a string.
pmsg = email.message_from_string(msg_text)
for toggle in (True, False):
    print("--- html_ok is %r ---" % toggle)
    traverse(pmsg, html_ok=toggle)
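
If you don't have a multipart/alternative message handy, one way to get a
msg_text to try this on is to build one with the MIME classes in the email
package. This is only a sketch: the subject and body strings are made up for
illustration, and the names outer and types are not from the example above.

```python
from email import message_from_string
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical sample content; any multipart/alternative message would do.
outer = MIMEMultipart("alternative")
outer["Subject"] = "sample"
# Alternatives are conventionally listed from least to most preferred,
# so the plain-text version goes first and the HTML version last.
outer.attach(MIMEText("plain body\n", "plain"))
outer.attach(MIMEText("<p>html body</p>\n", "html"))
msg_text = outer.as_string()

# Round-trip through the parser to confirm the structure.
pmsg = message_from_string(msg_text)
types = [sub.get_content_type() for sub in pmsg.get_payload()]
print(types)
```

Because the alternatives are listed in increasing order of preference,
keeping the last acceptable subpart picks the richest version that is
allowed.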
With a suitable message, this produced:
--- html_ok is True ---
multi:True type:'multipart/alternative' file:'<<NoFileName>>'
multi:False type:'text/html' file:'<<NoFileName>>'
--- html_ok is False ---
multi:True type:'multipart/alternative' file:'<<NoFileName>>'
multi:False type:'text/plain' file:'<<NoFileName>>'
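
For comparison, later versions of the email package can make this choice
themselves. The sketch below assumes Python 3.6 or later, where parsing with
policy.default yields EmailMessage objects that have a get_body() method
taking a preference list. It is not what the example above does, just an
alternative for a simple message; the sample message is made up.

```python
from email import message_from_string, policy
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

# Hypothetical sample message with a plain and an HTML alternative.
outer = MIMEMultipart("alternative")
outer.attach(MIMEText("plain body\n", "plain"))
outer.attach(MIMEText("<p>html body</p>\n", "html"))

# policy.default makes the parser return EmailMessage objects,
# which provide get_body().
pmsg = message_from_string(outer.as_string(), policy=policy.default)

# First preference list allows HTML, second allows plain text only.
for prefs in (("html", "plain"), ("plain",)):
    body = pmsg.get_body(preferencelist=prefs)
    print("%r -> %s" % (prefs, body.get_content_type()))
```

get_body() returns None when nothing in the preference list matches, so real
code should check for that before using the result.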