Parse ASCII log; sort and keep most recent entries
Larry Bates
lbates at swamisoft.com
Wed Jun 16 19:59:19 EDT 2004
Here's a quick solution.
Larry Bates
Syscon, Inc.
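
def cmpfunc(x, y):
    # Each entry starts with (sdate, stime): the date as YYMMDD and the
    # zero-padded time, so plain string comparison orders them correctly.
    xdate = x[0]
    xtime = x[1]
    ydate = y[0]
    ytime = y[1]
    if xdate == ydate:
        #
        # If the two dates are equal, I must check the times
        #
        if xtime > ytime: return 1
        elif xtime == ytime: return 0
        else: return -1
    elif xdate > ydate: return 1
    return -1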

fp = file(yourlogfilepath, 'r')
lines = fp.readlines()
fp.close()

entries = []
months = {'JAN': '01', 'FEB': '02', 'MAR': '03', 'APR': '04',
          'MAY': '05', 'JUN': '06', 'JUL': '07', 'AUG': '08',
          'SEP': '09', 'OCT': '10', 'NOV': '11', 'DEC': '12'}
logdict = {}
for line in lines:
    #
    # Skip blank lines instead of stopping at the first one
    #
    if not line.strip(): continue
    #
    # split() with no argument tolerates runs of spaces between fields
    #
    pid, name, date, time = line.split()
    #
    # Must zero pad time for proper comparison
    #
    stime = time.zfill(8)
    #
    # Must reformat the date as YYMMDD
    #
    sdate = date[-2:] + months[date[2:5]] + date[:2]
    entries.append((sdate, stime, pid, name, date, time))

#
# Sort newest first, then keep only the first entry seen for each PID
#
entries.sort(cmpfunc)
entries.reverse()
for sdate, stime, pid, name, date, time in entries:
    if logdict.has_key(pid): continue
    logdict[pid] = (pid, name, date, time)

#
# Note: dictionary keys come back in no particular order
#
for key in logdict.keys():
    pid, name, date, time = logdict[key]
    print pid, name, date, time
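
For anyone reading this on Python 3, the same idea (turn date + time into
something that compares correctly, then keep the newest entry per PID) can be
sketched without a comparison function. This is only a rough sketch, not the
code from the post above: the names MONTHS, sort_key, latest_per_pid and
SAMPLE_LOG are made up for illustration, the sample data is copied from the
quoted message, and it assumes every usable line has exactly four
whitespace-separated fields with a two-digit day and a two-digit year.

```python
# Hypothetical Python 3 sketch; all names here are illustrative assumptions.
MONTHS = {name: number for number, name in enumerate(
    ["JAN", "FEB", "MAR", "APR", "MAY", "JUN",
     "JUL", "AUG", "SEP", "OCT", "NOV", "DEC"], start=1)}

# Sample data copied from the quoted message below.
SAMPLE_LOG = """\
1234 williamstim 01AUG03 7:44:31
2348 williamstim 02AUG03 14:11:20
23 jonesjimbo 07AUG03 15:25:00
2348 williamstim 17AUG03 9:13:55
748 jonesjimbo 13OCT03 14:10:05
23 jonesjimbo 14OCT03 23:01:23
748 jonesjimbo 14OCT03 23:59:59
"""


def sort_key(date, time):
    """Turn '01AUG03' and '7:44:31' into a tuple of ints that sorts by time."""
    day = int(date[:2])
    month = MONTHS[date[2:5].upper()]
    year = int(date[5:])
    hour, minute, second = (int(part) for part in time.split(":"))
    return (year, month, day, hour, minute, second)


def latest_per_pid(lines):
    """Return {pid: (key, original_line)} holding the newest entry per PID."""
    latest = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        pid, name, date, time = fields
        key = sort_key(date, time)
        # Keep this line only if it is newer than what we already have
        if pid not in latest or key > latest[pid][0]:
            latest[pid] = (key, line.strip())
    return latest


latest = latest_per_pid(SAMPLE_LOG.splitlines())
# Print the surviving entries, oldest first
for key, text in sorted(latest.values()):
    print(text)
```

Comparing tuples of integers avoids the zero padding step, and tracking the
newest entry per PID in a dictionary means the full list never has to be
sorted; only the surviving entries are sorted for printing.
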
"Nova's Taylor" <novastaylor at hotmail.com> wrote in message
news:fda4b581.0406161306.c5de18f at posting.google.com...
> Hi folks,
>
> I am a newbie to Python and am hoping that someone can get me started
> on a log parser that I am trying to write.
>
> The log is an ASCII file that contains a process identifier (PID),
> username, date, and time field like this:
>
> 1234 williamstim 01AUG03 7:44:31
> 2348 williamstim 02AUG03 14:11:20
> 23 jonesjimbo 07AUG03 15:25:00
> 2348 williamstim 17AUG03 9:13:55
> 748 jonesjimbo 13OCT03 14:10:05
> 23 jonesjimbo 14OCT03 23:01:23
> 748 jonesjimbo 14OCT03 23:59:59
>
> I want to read in and sort the file so the new list contains only
> the most recent entry for each PID (PIDs get reused often). In my example,
> the new list would be:
>
> 1234 williamstim 01AUG03 7:44:31
> 2348 williamstim 17AUG03 9:13:55
> 23 jonesjimbo 14OCT03 23:01:23
> 748 jonesjimbo 14OCT03 23:59:59
>
> So I need to sort by PID and date + time, then keep the most recent.
>
> Any help would be appreciated!
>
> Taylor
>
> NovasTaylor at hotmail.com