Parse ASCII log ; sort and keep most recent entries

Peter Hansen peter at engcorp.com
Fri Jun 18 20:57:12 EDT 2004


Nova's Taylor wrote:

> This is what I wound up using:

Could I repeat part of my earlier suggestion?  See below:

> piddict = {}
> for line in sourceFile:
>         pid,username,date,time = line.split()
>         piddict[pid] = (username,date,time)

Here you are splitting the whole line into its fields, and storing
a Python tuple rather than the original "line" contents...

> pidlist = piddict.keys()
> pidlist.sort()
> for pid in pidlist:
>         username,date,time = piddict[pid]
>          # next line seems amateurish, but that is what I am!
>         logFile.write(pid + " " + username + " " + date + "" + time +
> "\n")

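Here you are writing out something that is meant to be exactly
equal (if I read this all correctly) to the original line, but
to get it you have to unpack the tuple and concatenate lots of
strings again with spaces, the newline, etc.  (As quoted, the ""
between date and time has no space in it, so those two fields
would run together.)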

Why not just store the original line and use it at the end:

for line in sourceFile:
     pid, _ = line.split(' ', 1)
     piddict[pid] = line

and later, use writelines as Christos suggested, without
even needing a loop:

logFile.writelines(piddict.values())
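
Put together, a minimal sketch of that approach might look like the
following.  The sample lines are made up, and the StringIO objects are
only stand-ins for your real sourceFile and logFile so the example can
run on its own; I'm assuming each line starts with the pid followed by
a single space.

```python
import io

# Made-up sample data standing in for the real log; the pid is the first field.
sourceFile = io.StringIO(
    "1032 alice 2004-06-17 09:15:00\n"
    "0987 bob 2004-06-17 10:02:41\n"
    "1032 alice 2004-06-18 08:30:12\n"
)
logFile = io.StringIO()  # stand-in for the real output file

piddict = {}
for line in sourceFile:
    # Split off only the first field; the rest of the line is left untouched.
    pid, _ = line.split(' ', 1)
    # A later line with the same pid replaces the earlier one.
    piddict[pid] = line

# writelines() adds no newlines of its own; each stored line still has its "\n".
logFile.writelines(piddict.values())
output = logFile.getvalue()
```

Since each stored line keeps its trailing newline, nothing has to be
added back when writing.  Note that values() makes no promise of pid
order.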

The difference in the writing part is that you are sorting by
pid, though I'm not clear why or if it's required.  If it is,
you could still loop, but more simply:

for pid in pidlist:
     logFile.write(piddict[pid])

No splitting, no concatenating...
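
If the sorting does matter, here is a rough sketch of that loop in
self-contained form.  The dictionary contents are invented, and
key=int reflects my assumption that the pids are numeric: the keys are
strings, so a plain sort would compare them as text and put "1032"
before "987".

```python
import io

# Hypothetical pid -> original line mapping, as built by the loop above.
piddict = {
    "1032": "1032 alice 2004-06-18 08:30:12\n",
    "987": "987 bob 2004-06-17 10:02:41\n",
}
logFile = io.StringIO()  # stand-in for the real output file

# sorted() returns a new list of the keys; key=int compares them as numbers.
for pid in sorted(piddict, key=int):
    logFile.write(piddict[pid])

result = logFile.getvalue()
```

If the pids are not purely numeric, drop key=int and accept the
text ordering, or zero-pad them when the log is written.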

-Peter


