Handling emails

TheSaint nobody at nowhere.net.no
Sun Jun 12 09:57:38 EDT 2011


Steven D'Aprano wrote:

First of all, thanks for the reply.

>> header = _pop.top(nmuid, 0)
 
> To parse emails, you should use the email package. It already handles
> bytes and strings.
I've read quite a bit of documentation this afternoon, but most of what I
tried led to errors. That could be down to my own ignorance :)
From what I could work out, I decided to write my own code.

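def msg_parser(listOfBytes):
    # _FULLhdr is assumed to be a module-level sequence of the header
    # names to keep; it is defined elsewhere.
    header = {}
    for lin in listOfBytes:
        try:
            line = lin.decode()
        except UnicodeDecodeError:
            # Skip lines that are not valid UTF-8.
            continue
        for key in _FULLhdr:
            if key in line:
                header[key] = line
                # First matching key wins; move on to the next line.
                break
    return header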

listOfBytes is the header content, which is the second element of the tuple
returned by poplib.POP3.top(num_msg, how_much).

However, some lines fail to decode correctly. I can't imagine why emails
don't comply with a standard.
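
One way to avoid dropping the lines that fail to decode is to fall back to
Latin-1, which maps every possible byte to a character and so never raises.
This is only a sketch: decode_header_line and sample_lines are made-up names,
and the sample data merely imitates the list of bytes that top() returns.

```python
def decode_header_line(raw):
    """Decode one raw header line to str without ever raising."""
    try:
        return raw.decode('ascii')
    except UnicodeDecodeError:
        # Latin-1 assigns a code point to every byte value, so this
        # cannot fail, although the text may not be what the sender meant.
        return raw.decode('latin-1')


# Hypothetical sample, shaped like the second element of the top() tuple.
sample_lines = [
    b'From: someone@example.com',
    b'Subject: caf\xe9 menu',  # raw 8-bit byte: not ASCII, not valid UTF-8
]

decoded = [decode_header_line(line) for line in sample_lines]
```

The result is always a str, so the later string-based processing keeps
working, at the price of possibly showing the wrong accented characters.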

> Other than that, I'm not entirely sure I understand your problem. In
> general, if you have some bytes, you can decode it into a string by hand:

I see. I haven't learned good English yet :P. I'm Italian :)
 
> >>> header = b'To: python-list at python.org\n'
> >>> s = header.decode('ascii')
> >>> s
> 'To: python-list at python.org\n'

I know this, but it isn't applicable when dealing with the entire message
header and envelope.
The libraries that handle emails and their headers seem very confusing to me,
and I suppose I should take a different, smaller approach.
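
For comparison, here is roughly what the email package route could look like
for a list of header lines. It is a sketch under assumptions: the lines
variable is invented sample data standing in for the list returned by top(),
and the addresses are placeholders. email.message_from_bytes needs
Python 3.2 or later.

```python
import email
from email.header import decode_header, make_header

# Hypothetical header lines, as bytes, ending with the blank separator line.
lines = [
    b'From: Example Sender <sender@example.com>',
    b'To: list@example.org',
    b'Subject: =?utf-8?q?Caf=C3=A9_menu?=',
    b'',
]

# Join the lines back into one bytes object and let the parser split headers.
msg = email.message_from_bytes(b'\r\n'.join(lines))

to = msg['To']
# Subjects may use RFC 2047 "encoded words"; decode them into readable text.
subject = str(make_header(decode_header(msg['Subject'])))
```

After this, to and subject are ordinary strings that string-based filters
can work on.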

I'll try to post a header (if its content doesn't breach anyone's privacy),
but, unlike the example above, the result of *_pop.top(nmuid, 0)* won't fit
into your example.

> If this is not what you mean, perhaps you should give an example of what
> header looks like

The difference is that the previous version returned text strings, and the
subsequent processing is based on string manipulation.
Just to mention, my program reads headers from a POP3 or IMAP4 server and
applies some regex filtering in order to remove unwanted emails from the
server. All the filters treat their input as ASCII character strings.
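
To make the filtering step concrete, a minimal sketch follows. The FILTERS
table, the patterns in it and the is_unwanted function are all invented for
illustration; it assumes the headers have already been decoded into a dict
that maps header names to str values.

```python
import re

# Hypothetical rules: header name -> compiled pattern marking unwanted mail.
FILTERS = {
    'Subject': re.compile(r'\b(viagra|lottery)\b', re.IGNORECASE),
    'From': re.compile(r'@spam\.example$', re.IGNORECASE),
}


def is_unwanted(headers):
    """Return True if any header value matches its filter pattern."""
    for name, pattern in FILTERS.items():
        # A missing header is treated as an empty string, which never matches.
        if pattern.search(headers.get(name, '')):
            return True
    return False


spam = is_unwanted({'Subject': 'You won the LOTTERY', 'From': 'a@example.com'})
ham = is_unwanted({'Subject': 'Meeting notes', 'From': 'bob@example.com'})
```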
 
I ran my modules through 2to3 to convert them to the newer Python, but on
the first run it complained that the downloaded header is not a string.
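
The kind of failure described here can be reproduced in a few lines: in
Python 3, poplib hands back bytes, and a str regex refuses to search bytes.
The raw and pattern names below are made up for the demonstration; the two
ways out shown are decoding first, or using a bytes pattern.

```python
import re

raw = b'Subject: hello'             # what a Python 3 header line looks like
pattern = re.compile(r'^Subject:')  # a str pattern, as in Python 2 era code

try:
    pattern.search(raw)
    failed = False
except TypeError:
    # A str pattern cannot be applied to a bytes object.
    failed = True

# Option 1: decode once at the boundary, then keep the str-based filters.
text = raw.decode('ascii')
str_matched = pattern.search(text) is not None

# Option 2: keep the data as bytes and write the pattern as bytes too.
bytes_matched = re.search(rb'^Subject:', raw) is not None
```

Decoding once, right after the download, leaves the rest of a string-based
program untouched, which is probably the smaller change.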

-- 
goto /dev/null


