number of different lines in a file
r.e.s.
r.s at ZZmindspring.com
Thu May 18 17:51:30 EDT 2006
I have a million-line text file with 100 characters per line,
and simply need to determine how many of the lines are distinct.
On my PC, this little program just goes to never-never land:
    def number_distinct(fn):
        f = file(fn)
        x = f.readline().strip()
        L = []
        while x<>'':
            if x not in L:
                L = L + [x]
            x = f.readline().strip()
        return len(L)
Would anyone care to point out improvements?
Is there a better algorithm for doing this?
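
For comparison, here is a minimal sketch of one common alternative, not a definitive answer. The loop above scans the whole list for every line (`x not in L`) and copies the list on every append (`L = L + [x]`), so the work grows roughly with the square of the number of distinct lines. A set offers average constant-time membership and insertion instead. The sketch assumes a Python version that has the built-in `set` type and the `with` statement, and that the distinct lines fit in memory. Unlike the original, it does not stop at the first blank line; a blank line is simply counted as one more distinct value.

```python
def number_distinct(fn):
    """Return the number of distinct lines in the text file at path fn.

    Leading and trailing whitespace is stripped from each line, as in
    the original loop, so lines differing only in that way count once.
    """
    seen = set()  # hash-based: average O(1) membership test and insert
    with open(fn) as f:
        for line in f:  # iterate lazily, one line at a time
            seen.add(line.strip())
    return len(seen)
```

Calling `number_distinct("lines.txt")` (a hypothetical file name) should then make a single pass over the file.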