Searching through two logfiles in parallel?
Oscar Benjamin
oscar.j.benjamin at gmail.com
Tue Jan 8 18:40:01 EST 2013
On 8 January 2013 19:16, darnold <darnold992000 at yahoo.com> wrote:
> i don't think in iterators (yet), so this is a bit wordy.
> same basic idea, though: for each message (set of parameters), build a
> list of transactions consisting of matching send/receive times.
The advantage of an iterator based solution is that we can avoid
loading all of both log files into memory.
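For example, each log can be wrapped in a generator that yields parsed entries one line at a time, so neither file is read into memory whole. This is only a minimal sketch: `parse`, the file layout (timestamp, then parameters), and the matching rule are hypothetical stand-ins for the OP's own logic.

```python
import collections

def parse(line):
    # Hypothetical stand-in for the OP's parse(): first whitespace-separated
    # field is the timestamp, the remainder is the message parameters.
    timestamp, _, params = line.partition(' ')
    return timestamp, params.strip()

def entries(path):
    # Lazily yield (timestamp, params) pairs, one line at a time.
    with open(path) as f:
        for line in f:
            if line.strip():
                yield parse(line)

def match_times(send_path, recv_path):
    # Stream the send log first, then fill in receive times from the
    # other log; only the accumulated results dict lives in memory.
    results = collections.defaultdict(list)
    for timestamp, params in entries(send_path):
        results[params].append({'sendTime': timestamp, 'receiveTime': None})
    for timestamp, params in entries(recv_path):
        for txn in results[params]:
            if txn['receiveTime'] is None:
                txn['receiveTime'] = timestamp
                break
    return results
```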
[SNIP]
>
> results = {}
>
> for line in sendData.split('\n'):
>     if not line.strip():
>         continue
>
>     timestamp, params = parse(line)
>     if params not in results:
>         results[params] = [{'sendTime': timestamp, 'receiveTime': None}]
>     else:
>         results[params].append({'sendTime': timestamp, 'receiveTime': None})
[SNIP]
This kind of logic is made a little easier (and more efficient) if you
use a collections.defaultdict instead of a plain dict, since it removes
the need to check whether the key is already present. Example:
>>> import collections
>>> results = collections.defaultdict(list)
>>> results
defaultdict(<type 'list'>, {})
>>> results['asd'].append(1)
>>> results
defaultdict(<type 'list'>, {'asd': [1]})
>>> results['asd'].append(2)
>>> results
defaultdict(<type 'list'>, {'asd': [1, 2]})
>>> results['qwe'].append(3)
>>> results
defaultdict(<type 'list'>, {'qwe': [3], 'asd': [1, 2]})
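With that, the if/else in the quoted loop collapses to a single append. A minimal sketch using the OP's variable names; `parse` and the `sendData` contents here are made-up placeholders:

```python
import collections

results = collections.defaultdict(list)

# Placeholder log contents standing in for the OP's sendData.
sendData = "09:00:01 alpha\n09:00:02 beta\n09:00:03 alpha\n"

def parse(line):
    # Hypothetical parse(): first field is the timestamp, the rest
    # of the line is the message parameters.
    timestamp, _, params = line.partition(' ')
    return timestamp, params.strip()

for line in sendData.split('\n'):
    if not line.strip():
        continue
    timestamp, params = parse(line)
    # No membership test needed: a missing key starts as an empty list.
    results[params].append({'sendTime': timestamp, 'receiveTime': None})
```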
Oscar