Parse ASCII log ; sort and keep most recent entries

David Fisher fishboy at SPAMredSPAMpeanutSPAM.com
Wed Jun 16 21:11:59 EDT 2004


novastaylor at hotmail.com (Nova's Taylor) writes:

> Hi folks,
> 
> I am a newbie to Python and am hoping that someone can get me started
> on a log parser that I am trying to write.
> 
> The log is an ASCII file that contains a process identifier (PID),
> username, date, and time field like this:
> 
> 1234 williamstim 01AUG03 7:44:31               
> 2348 williamstim 02AUG03 14:11:20              
> 23 jonesjimbo 07AUG03 15:25:00                 
> 2348 williamstim 17AUG03 9:13:55               
> 748 jonesjimbo 13OCT03 14:10:05                
> 23 jonesjimbo 14OCT03 23:01:23                 
> 748 jonesjimbo 14OCT03 23:59:59  
> 
> I want to read in and sort the file so the new list contains only
> the most recent entry for each PID (PIDs get reused often). In my
> example, the new list would be:
> 
> 1234 williamstim 01AUG03 7:44:31               
> 2348 williamstim 17AUG03 9:13:55               
> 23 jonesjimbo 14OCT03 23:01:23                 
> 748 jonesjimbo 14OCT03 23:59:59  
> 
> So I need to sort by PID and date + time, then keep the most recent.
> 
> Any help would be appreciated!
> 
> Taylor
> 
> NovasTaylor at hotmail.com
#!/usr/bin/env python
#
# I'm expecting the log file to be in chronological order,
# so later entries are later in time.
# Using the dict, later entries for a PID overwrite earlier ones.
# make a script and use this like
# logparse.py mylogfile.log > newlogfile.log
#
import fileinput
piddict = {}
for line in fileinput.input():
        fields = line.split()
        if len(fields) != 4:
                continue        # skip blank or malformed lines
        pid,username,date,time = fields
        piddict[pid] = (username,date,time)
#
pidlist = list(piddict.keys())
pidlist.sort()          # PIDs are strings here, so this sorts them as text
for pid in pidlist:
        username,date,time = piddict[pid]
        print("%s %s %s %s" % (pid,username,date,time))
#tada!
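
If the log can't be trusted to be in chronological order, a variation is to
parse the date and time and keep the entry with the latest timestamp for each
PID. This is only a rough sketch, not a tested replacement for the script
above. It assumes Python 3.7 or later (where dicts remember insertion order),
English month abbreviations, and that 01AUG03 means day, month, two-digit
year. The function name latest_per_pid is made up for the example.

```python
from datetime import datetime

def latest_per_pid(lines):
    """Return one line per PID: the entry with the latest timestamp."""
    latest = {}  # pid -> (timestamp, original fields)
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        pid, username, date, time = fields
        # Assumed format: 01AUG03 7:44:31 (day, month abbreviation, 2-digit year)
        stamp = datetime.strptime(date.title() + " " + time, "%d%b%y %H:%M:%S")
        # Keep this entry if the PID is new or the timestamp is not older
        if pid not in latest or stamp >= latest[pid][0]:
            latest[pid] = (stamp, fields)
    # PIDs come out in the order they first appeared in the log
    return [" ".join(fields) for stamp, fields in latest.values()]

if __name__ == "__main__":
    import fileinput
    for entry in latest_per_pid(fileinput.input()):
        print(entry)
```

Run it the same way as the script above. Because an existing key keeps its
place in the dict when its value is replaced, the PIDs are printed in the
order they first show up in the log rather than sorted as text.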


