Markov Analysis Help

Sat May 17 00:52:59 EDT 2008

Hi Guys,

I've written a Markov analysis program and would like to get your 
comments on the code. As it stands now, the final output comes out as a 
tuple, then a list, then a tuple, something like ('the', 'water') ['us'] 
('we', 'took') and so on.

I'm still learning, so I don't know any advanced techniques or methods 
that might have made this easier.

Here's the code:

def makelist(f):    #turn a document into a list of words
    fin = open(f)
    results = []
    for line in fin:
        line = line.replace('"', '')    #drop double quotes
        results.extend(line.split())    #split() ignores surrounding whitespace
    fin.close()
    return results

def markov(f, preflen=2):    #f is the file to analyze, preflen is prefix length
    words = makelist(f)
    mapdict = {}    #dict where the prefixes will map to suffixes
    #stop early so every prefix is full length and has a word after it
    for start in range(len(words) - preflen):
        end = start + preflen    #start/end set the slice size
        prefix = tuple(words[start:end])    #tuple as mapdict key
        suffix = words[end]    #the word right after the prefix
        mapdict.setdefault(prefix, []).append(suffix)    #append suffixes
    return mapdict

import random    #needed for random.choice

def randsent(f, amt=10):    #prints amt random prefix/suffix pairs
    analyze = markov(f)
    for i in range(amt):
        rkey = random.choice(analyze.keys())
        print rkey, analyze[rkey],    #trailing comma keeps output on one line
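
A minimal sketch of how one entry of the map could be printed as plain 
words instead of a tuple followed by a list. The sample dict and the 
format_entry helper are made up for illustration and are not part of 
the program above; this is only one possible approach.

```python
import random

# Hypothetical sample of the prefix -> suffixes map, written out by hand.
sample = {
    ('the', 'water'): ['us'],
    ('we', 'took'): ['the', 'a'],
}

def format_entry(prefix, suffixes):
    """Join a prefix tuple and one randomly chosen suffix into plain text."""
    return ' '.join(prefix + (random.choice(suffixes),))

print(format_entry(('the', 'water'), sample[('the', 'water')]))
```

With a single suffix in the list the result is always the same; with 
several suffixes, random.choice picks one of them each time.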

The book gave a hint, saying to make the prefixes in the dict using:

def shift(prefix, word):
	return prefix[1:] + (word, )
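
As far as I can tell, shift drops the first word of the prefix tuple and 
appends the new word, so the prefix slides along the text one word at a 
time. A small sketch of just that behavior, using a made-up word list 
(the sample words and the seen list are only for illustration):

```python
def shift(prefix, word):
    # drop the oldest word, append the newest one
    return prefix[1:] + (word,)

words = ['we', 'took', 'the', 'water']    # made-up sample text
prefix = tuple(words[:2])                 # first prefix: ('we', 'took')
seen = [prefix]
for word in words[2:]:
    prefix = shift(prefix, word)          # slide the window by one word
    seen.append(prefix)

for p in seen:
    print(p)
```

Each printed tuple has the same length as the first prefix, and each one 
overlaps the previous one by all but its first word.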

However, I can't seem to wrap my head around incorporating that into the 
code above. If you know a method or could point me in the right 
direction (or think that I don't need to use it), please let me know.

Thanks for all your help,

Dave