Suggest more finesse, please. I/O and sequences.

Fri Mar 25 17:11:58 EST 2005

You might take advantage of the .get method on
dictionaries to rewrite:

wordsDic = {}
inFile = open( sys.argv[1] )
for word in inFile.read().split():
    if wordsDic.has_key( word ):
        wordsDic[word] = wordsDic[word] + 1
    else:
        wordsDic[word] = 1

as:

wordsDic = {}
inFile = open( sys.argv[1] )
for word in inFile.read().split():
    wordsDic[word] = wordsDic.get(word, 0) + 1
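
In case it helps to try this in isolation, here is a minimal sketch of the
same counting loop wrapped in a function. The name count_words and the idea
of passing the text in as a string are my own assumptions, not part of your
script. It avoids has_key (which is gone in Python 3), so it should run on
either Python 2 or 3.

```python
def count_words(text):
    """Count whitespace-separated words in text (hypothetical helper)."""
    counts = {}
    for word in text.split():
        # get() returns 0 for a word not seen yet, so no membership test is needed
        counts[word] = counts.get(word, 0) + 1
    return counts


if __name__ == "__main__":
    print(count_words("spam eggs spam ham spam eggs"))
```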

and, taking advantage of tuple unpacking and % formatting, rewrite:

for pair in wordsLst:
    outFile.write( str( pair[1] ).rjust( 7 ) + " : " + str( pair[0] ) + "\n")

as:

for word, count in wordsLst:
    outFile.write("%7i : %s\n" % (count, word))

The field width of 7 for the count is copied from your rjust( 7 ). A count
with more than 7 digits would still be written in full; it would only break
the column alignment.
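
A small standalone sketch of the unpacking plus % formatting idea, if you
want to check the output format without touching files. The function name
format_ranking and the sample pairs are made up for illustration; the pairs
are assumed to be (word, count) tuples like the ones wordsDic.items() gives
you, and the count is written first to match your original output.

```python
import sys


def format_ranking(pairs):
    """Return one formatted line per (word, count) pair (hypothetical helper)."""
    lines = []
    for word, count in pairs:  # tuple unpacking instead of pair[0], pair[1]
        # %7i right-justifies the count in a field at least 7 characters wide
        lines.append("%7i : %s\n" % (count, word))
    return lines


if __name__ == "__main__":
    for line in format_ranking([("the", 12), ("python", 3)]):
        sys.stdout.write(line)
```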

But there are many other "good" ways I'm sure.
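
On your question about reading by words without reading the whole file
first: one possible approach (a sketch, not the only way) is a small
generator that loops over the file line by line and yields each word. The
name iter_words is my own invention. It relies only on the fact that a file
object can be iterated over line by line, so any iterable of strings works
for testing.

```python
def iter_words(lines):
    """Yield words one at a time from an iterable of lines (hypothetical helper)."""
    for line in lines:
        # split() with no argument drops the newline and any repeated whitespace
        for word in line.split():
            yield word


if __name__ == "__main__":
    sample = ["to be or\n", "\n", "not to be\n"]
    for word in iter_words(sample):
        print(word)
```

With that in place, the counting loop could read
"for word in iter_words(inFile):" and only one line is held in memory at a
time.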

Larry Bates

Qertoip wrote:
> Would you suggest any improvements to the following code?
> I want to make my implementation as simple, as Python-native, and as fine as
> possible.
> 
> I've written some simple code that reads an input text file and creates a
> ranking of words by number of appearances.
> 
> Code:
> ---------------------------------------------------------------------------
> import sys
> 
> def moreCommonWord( x, y ):
> 	if x[1] != y[1]:
> 		return cmp( x[1], y[1] ) * -1
> 	return cmp( x[0], y[0] )
> 
> wordsDic = {}
> inFile = open( sys.argv[1] )
> for word in inFile.read().split():
> 	if wordsDic.has_key( word ):
> 		wordsDic[word] = wordsDic[word] + 1
> 	else:
> 		wordsDic[word] = 1
> inFile.close()
> 
> wordsLst = wordsDic.items()
> wordsLst.sort( moreCommonWord )
> 
> outFile = open( sys.argv[2], 'w')
> for pair in wordsLst:
> 	outFile.write( str( pair[1] ).rjust( 7 ) + " : " + str( pair[0] ) + "\n" )
> outFile.close()
> ---------------------------------------------------------------------------
> 
> In particular, I don't like reading the whole file just to split it.
> It is easy to read by lines - may I read by words with the same ease?
> 
> PS I've been learning Python since this morning, so be understanding :>
>