Newbie with sort text file question
Andrew Dalke
adalke at mindspring.com
Sun Jul 13 17:12:12 EDT 2003
Bob Gailer:
> [Pipeline]
Huh. Hadn't heard of that one before. Thanks for the pointer.
(And overall, nice post!)
> The Python version:
Some stylistic comments:
> input = file('c:\input.txt')
Since 'input' is a builtin, I use 'infile'. That's only a preference of
mine.
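For the OP, it's safer to write 'c:\\input.txt' (or the raw string
r'c:\input.txt') because '\' has special meaning inside a string literal
and so must be escaped. As written it only works because '\i' isn't a
recognized escape; a name like 'c:\temp.txt' would get a tab instead.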
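To make the escaping point concrete, here is a small sketch. The path
'c:\temp.txt' is a made-up example, not something from the original post,
and the snippet assumes a current Python 3 interpreter.

```python
# Two equivalent ways to spell a Windows path with a literal backslash.
escaped = 'c:\\input.txt'   # backslash escaped by doubling it
raw = r'c:\input.txt'       # raw string: backslashes are left alone

# Hypothetical path chosen to show the pitfall: '\t' is the tab escape.
risky = 'c:\temp.txt'

print(escaped == raw)       # the two spellings give the same string
print(len(escaped))         # the backslash counts as a single character
print('\t' in risky)        # the "path" now contains a tab character
```

Forward slashes are another option on Windows, if you'd rather avoid
backslashes in string literals altogether.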
> fruits = {} # a dictionary to hold each fruit and its count
> lines = input.readlines()
> for line in lines:
Since you are using Python 2.2 (later you use "if fruit in fruits",
and "in" support for dicts, via __contains__, wasn't added until
Python 2.2, I think, and the 'file' usage is also new), this is best
written as
    for line in input:
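A minimal sketch of iterating over the file object directly. It uses an
in-memory io.StringIO as a stand-in for the real file, with invented
sample lines, and assumes a current Python 3. The loop pulls one line at
a time, so nothing has to read the whole file into a list first.

```python
import io

# Stand-in for the real input file; the contents are invented sample data.
infile = io.StringIO("apple_1\nbanana_2\napple_3\n")

seen = []
for line in infile:              # the file object is its own line iterator
    seen.append(line.rstrip("\n"))

print(seen)
```

Note that a later step which needs all the lines (such as counting them
with len) would then have to collect them itself, as 'seen' does here.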
>     fruit = line.split('_', 1)[0]
>     if fruit in fruits:
>         fruits[fruit] += 1 # increment count
>     else:
>         fruits[fruit] = 1 # add to dictionary with count of 1
Here's a handy idiom for what you want:
    fruits[fruit] = fruits.get(fruit, 0) + 1
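Here is that idiom in a complete loop, as a small sketch. The sample
lines are invented, in the "fruit_something" shape used in this thread.

```python
# Invented sample lines; only the part before the first "_" matters.
lines = ["apple_red", "banana_yellow", "apple_green"]

fruits = {}
for line in lines:
    fruit = line.split("_", 1)[0]
    # get() returns 0 for a fruit not seen yet, so no if/else is needed
    fruits[fruit] = fruits.get(fruit, 0) + 1

print(fruits)
```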
> output1 = file('c:\output1.txt', 'w')
> for key, value in fruits.items():
>     output1.write("%s occurs %s\n" % (key, value))
> output1.close()
> output2 = file('c:\output2.txt', 'w')
> output2.write("Total occurrences is %s\n" % len(lines))
> output2.close()
That's missing some sorts, so I don't think it meets the OP's
requirements.
How about this?
infile = open("input.txt")
lines = []
counts = {}
for line in infile:
    lines.append(line)
    fruit = line.split("_", 1)[0]
    counts[fruit] = counts.get(fruit, 0) + 1  # 0 for a fruit not seen yet
infile.close()
# Sort by name. In ASCII, "_" sorts after any uppercase letter (but
# before lowercase ones), so an uppercase "PLUM_" will be placed
# *after* "PLUMBAGO_", which is probably not what you want.
# Left as an exercise :)
lines.sort()
outfile = open("output1.txt", "w")  # "w" is needed to open for writing
for line in lines:
    outfile.write(line)
outfile.close()
# Print counts from highest count to lowest
count_data = [(n, fruit) for (fruit, n) in counts.items()]
count_data.sort()
count_data.reverse()  # sort() is ascending, so flip it for highest first
outfile = open("output2.txt", "w")  # "w" is needed to open for writing
total = 0
for n, fruit in count_data:
    outfile.write("%s occurs %s\n" % (fruit, n))
    total += n
outfile.write("\nTotal occurrences: %s\n" % total)
outfile.close()
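For the exercise left in the sorting comment, one possible approach is to
sort on the fruit name alone rather than on the whole line. This is only
a sketch: it assumes the wanted order is by fruit name first, it uses the
key= argument of sorted() (added to Python after this post was written),
and the sample lines and the helper name fruit_key are made up.

```python
# Invented sample data. The names are uppercase because "_" sorts after
# uppercase letters in ASCII, which is where a plain sort goes wrong.
lines = ["PLUM_2\n", "PLUMBAGO_1\n", "APPLE_3\n", "PLUM_1\n"]

def fruit_key(line):
    # Compare by fruit name first, then by the rest of the line.
    fruit, _, rest = line.partition("_")
    return (fruit, rest)

plain = sorted(lines)                    # whole-line comparison
by_fruit = sorted(lines, key=fruit_key)  # fruit name decides the order

print(plain)
print(by_fruit)
```

With the plain sort the PLUMBAGO line lands ahead of the PLUM lines; with
the key function the shorter name comes first.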
Andrew
dalke at dalkescientific.com