[Tutor] Logfile Manipulation
Stephen Nelson-Smith
sanelson at gmail.com
Mon Nov 9 17:10:44 CET 2009
On Mon, Nov 9, 2009 at 3:15 PM, Wayne Werner <waynejwerner at gmail.com> wrote:
> On Mon, Nov 9, 2009 at 7:46 AM, Stephen Nelson-Smith <sanelson at gmail.com>
> wrote:
>>
>> And the problem I have with the below is that I've discovered that the
>> input logfiles aren't strictly ordered - i.e. there is variance by a
>> second or so in some of the entries.
>
> Within a given set of 10 lines, is the first line and last line "in order" -
On average, in a sequence of 10 log lines, one will be out by one or
two seconds.
Here's a random slice:
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:36
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:36
05/Nov/2009:01:41:37
05/Nov/2009:01:41:37
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:37
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:37
05/Nov/2009:01:41:38
05/Nov/2009:01:41:36
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:38
05/Nov/2009:01:41:39
05/Nov/2009:01:41:38
05/Nov/2009:01:41:39
05/Nov/2009:01:41:39
05/Nov/2009:01:41:39
05/Nov/2009:01:41:39
05/Nov/2009:01:41:40
05/Nov/2009:01:41:40
05/Nov/2009:01:41:41
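To put a number on "out by one or two seconds", here is a minimal sketch that reports how far any stamp falls behind the latest one seen so far. The function name max_lag_seconds and the short sample list are made up for illustration; the format string assumes stamps exactly like the ones in the slice and an English locale for the month name.

```python
from datetime import datetime

# Assumed format, matching stamps such as 05/Nov/2009:01:41:37
FMT = "%d/%b/%Y:%H:%M:%S"

def max_lag_seconds(stamps):
    """Return the largest number of seconds by which any stamp
    falls behind the latest stamp seen before it."""
    latest = None
    worst = 0
    for s in stamps:
        t = datetime.strptime(s, FMT)
        if latest is None or t > latest:
            latest = t  # new high-water mark
        else:
            lag = int((latest - t).total_seconds())
            worst = max(worst, lag)
    return worst

# A few stamps in the same format as the slice (illustrative only)
sample = [
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:36",
    "05/Nov/2009:01:41:38",
    "05/Nov/2009:01:41:38",
    "05/Nov/2009:01:41:36",
]
print(max_lag_seconds(sample))
```

The result is a lag in seconds, not in lines, so it says how late an entry can be but not how many lines a reordering buffer would need to hold.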
> I don't know
> what the default python sorting algorithm is on a list, but AFAIK you'd be
> looking at a constant O(log 10)
I'm not a mathematician - what does this mean, in layperson's terms?
> log_generator = (d for d in logdata)
> mylist = # first ten values
OK
> while True:
> try:
> mylist.sort()
OK - sort the first 10 values.
> nextdata = mylist.pop(0)
So the first value...
> mylist.append(log_generator.next())
Right, so this appends one more value?
> except StopIteration:
> print 'done'
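For reference, a runnable sketch of the quoted loop. It assumes Python 3 (a for loop over the iterator rather than .next()), and the name windowed_sort and the window parameter are made up for illustration. It also assumes the entries compare in timestamp order, for example (timestamp, line) tuples, and that no entry turns up more than roughly `window` places later than it should. In the slice above one :36 entry appears well after the first :38 entries, so 10 may be too small for busy logs.

```python
from itertools import islice

def windowed_sort(entries, window=10):
    """Yield entries in sorted order using a small sliding buffer.

    Only correct if no entry is displaced by more than about
    `window` positions from where it belongs.
    """
    it = iter(entries)
    buf = list(islice(it, window))   # the first `window` values
    for item in it:
        buf.sort()                   # smallest buffered value moves to the front
        yield buf.pop(0)             # emit it
        buf.append(item)             # and take one more value from the input
    buf.sort()                       # input exhausted: drain what is left
    yield from buf

# Illustrative stamps only; real log lines would need the timestamp extracted first
sample = [
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:36",
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:38",
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:38",
]
for stamp in windowed_sort(sample, window=3):
    print(stamp)
```

Unlike the quoted version, the loop ends when the input runs out and then empties the buffer, so the last few values are not lost.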
> Or now that I look, python has a priority queue (
> http://docs.python.org/library/heapq.html ) that you could use instead. Just
> push the next value into the queue and pop one out - you give it some
> initial qty - 10 or so, and then it will always give you the smallest value.
That sounds very cool - and I see that one of the activestate recipes
Kent suggested uses heapq too. I'll have a play.
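A minimal sketch of the push-one, pop-one idea with heapq, under the same assumptions as before: entries compare in timestamp order and none is out of place by more than about `window` positions. The name heap_reorder and the window parameter are hypothetical; heapq.heapify, heapq.heappushpop and heapq.heappop are the standard library calls.

```python
import heapq
from itertools import islice

def heap_reorder(entries, window=10):
    """Yield entries in sorted order using a bounded heap."""
    it = iter(entries)
    heap = list(islice(it, window))  # initial quantity, e.g. 10 or so
    heapq.heapify(heap)              # arrange the list as a heap in place
    for item in it:
        # Push the new value, then pop and return the smallest one
        yield heapq.heappushpop(heap, item)
    while heap:                      # input exhausted: drain in order
        yield heapq.heappop(heap)

# Illustrative stamps only
sample = [
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:36",
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:38",
    "05/Nov/2009:01:41:37",
    "05/Nov/2009:01:41:38",
]
print(list(heap_reorder(sample, window=3)))
```

The heap avoids re-sorting the whole buffer on every line; each push and pop only touches a few elements.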
S.
--
Stephen Nelson-Smith
Technical Director
Atalanta Systems Ltd
www.atalanta-systems.com