Mail extraction problem (something's wrong with split methods)
Pierre Fortin
pfortin at pfortin.com
Sun Sep 12 13:37:12 EDT 2004
On Sun, 12 Sep 2004 17:32:15 +0200 Luka wrote:
This msg has already been processed by something that appears to generate
list/tuple segments... I suspect that whatever modified the message
has a string size limitation... However, it looks like whatever
manhandled this msg simply did a Python print of a tuple...
If you really want to process this type of message instead of getting at
the real problem, then here's a clue...
Here, I reduced the contents to just the items...
('+OK',
 ['Received',  # brackets, braces, parens are just text herein
  'by',
  'for',
  'Date',
  'Message-Id',
  'From',
  'To',
  'Subject',
  'X-Scanned-By: MIMEDefang 2.42',
  'X-Virus-Scanned',
  'Content-Length: 4210',
  'Status: ',
  '',
  '',
  '---Code block---',
  '[6964, 7086, ..., 6730',  # "[" is just text here
  ', 6793, ..., 5534]',      # "]" ditto
  '---Code block---'
 ],
 4815
)
Further reducing the items shows the structure:
('s',
 ['s', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's', 's',  # headers
  '',   # header/body separator
  '',
  's',  # ---Code block---
  's',  # 1st part
  's',  # 2nd part
  's'   # ---Code block---
 ],
 4815   # reported msg size
)
which boils down to:
(s,[s, ..., s],i) # aka: tuple(string,list(strings),int)
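To make that shape concrete, here is a small sketch that unpacks such a tuple and splits the list at the first empty string. The sample values and the names status, lines, size, headers and body are made up for illustration; only the (string, list of strings, int) layout comes from the message above.

```python
# Hypothetical sample with the same (string, list-of-strings, int) shape.
msg = (
    "+OK",
    [
        "From: someone@example.com",
        "Subject: test",
        "",                   # header/body separator
        "",
        "---Code block---",
        "[1, 2, 3",           # 1st part
        ", 4, 5]",            # 2nd part
        "---Code block---",
    ],
    4815,                     # reported msg size
)

status, lines, size = msg     # unpack the three fields

# The first empty string separates the headers from the body.
blank = lines.index("")
headers = lines[:blank]
body = lines[blank + 1:]

print(status, size)
print(headers)
print(body)
```

Note that body still starts with the second empty string; nothing here strips it.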
So... looks like you just need to isolate the strings between the
"---Code block---" strings (could be more than 2 or just 1) and
concatenate them, then split the result...
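Straight-line brute forcing it:
msg = ...                      # get the message as a tuple
sep = "---Code block---"
start = msg[1].index(sep)      # position of the opening marker
data = msg[1][start+1:]        # everything after the opening marker
end = data.index(sep)          # position of the closing marker
data = data[:end]              # just the parts between the markers
print("".join(data)[1:-1].split(", "))   # strip "[" and "]", then split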
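The same steps can be wrapped in a function, which makes them easier to reuse and test. This is only a sketch: extract_items and SEP are names invented here, the sample list is made up, and it assumes the joined text starts with "[" and ends with "]" as in the message above.

```python
SEP = "---Code block---"

def extract_items(lines, sep=SEP):
    """Return the items found between the first pair of `sep` lines."""
    start = lines.index(sep)                # opening marker
    end = lines.index(sep, start + 1)       # closing marker
    joined = "".join(lines[start + 1:end])  # glue the 1..n parts together
    return joined[1:-1].split(", ")         # drop "[" and "]", then split

# Hypothetical list of message lines, split in the same awkward place.
lines = ["Subject: test", "", "", SEP, "[1, 2, 3", ", 4, 5]", SEP]
items = extract_items(lines)
print(items)
```

list.index raises ValueError if either marker is missing, so a message without a code block fails loudly rather than returning garbage. The items come back as strings; if they are all numeric, [int(x) for x in items] converts them.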
> This is the original mail, sorry because of the size. As you can see,
> there are two problematic spots: 6730', ', and ','6573, at the end of
> the mail.
Pierre