Markov Analysis Help
dave
squareswallower at 1ya2hoo3.net
Sat May 17 00:52:59 EDT 2008
Hi Guys,
I've written a Markov analysis program and would like to get your
comments on the code. As it stands now, the final output comes out as a
tuple, then a list, then a tuple, something like
('the', 'water') ['us'] ('we', 'took') and so on.
I'm still learning so I don't know any advanced techniques or methods
that may have made this easier.
Here's the code:
import random  # needed by randsent below

def makelist(f):  # turn a document into a list
    fin = open(f)
    results = []
    for line in fin:
        line = line.replace('"', '')
        line = line.strip().split()
        for word in line:
            results.append(word)
    fin.close()
    return results

def markov(f, preflen=2):  # f is the file to analyze, preflen is prefix length
    convert_file = makelist(f)
    mapdict = {}  # dict where the prefixes will map to suffixes
    # stop early enough that every prefix has a word following it
    for start in range(len(convert_file) - preflen):
        end = start + preflen  # start/end set the slice size
        prefix = tuple(convert_file[start:end])  # tuple as mapdict key
        suffix = convert_file[end]  # word that follows the prefix
        mapdict.setdefault(prefix, []).append(suffix)  # append suffixes
    return mapdict

def randsent(f, amt=10):  # prints a random sentence
    analyze = markov(f)
    for i in range(amt):
        rkey = random.choice(analyze.keys())
        print rkey, analyze[rkey],
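
The tuple, list, tuple output described above is what randsent produces when it picks an unrelated random key on every pass and prints that key next to its whole suffix list. A minimal sketch of one alternative, following the chain from prefix to suffix, might look like the code below. The names walk, rng and sample are made up for this illustration, and the function returns a string instead of printing so that it runs unchanged on Python 2 and Python 3.

```python
import random

def walk(mapdict, amt=10, rng=random):
    """Return at most amt words by following prefix -> suffix links."""
    # sorted() turns the keys into a list on both Python 2 and Python 3
    prefix = rng.choice(sorted(mapdict))
    words = list(prefix)
    while len(words) < amt:
        suffixes = mapdict.get(prefix)
        if not suffixes:
            break  # dead end: nothing was recorded after this prefix
        word = rng.choice(suffixes)
        words.append(word)
        # slide the window: drop the oldest word, add the new one
        prefix = prefix[1:] + (word,)
    return ' '.join(words[:amt])

# hypothetical mapping, in the same shape that markov() returns
sample = {
    ('we', 'took'): ['the'],
    ('took', 'the'): ['water'],
    ('the', 'water'): ['and'],
}
sentence = walk(sample)
```

Because each new prefix is built from the previous one, consecutive words in the result always come from a pair that actually occurred in the source text.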
The book gave a hint saying to make the prefixes in the dict using:
def shift(prefix, word):
    return prefix[1:] + (word, )
However, I can't seem to wrap my head around incorporating that into the
code above. If you know a method or could point me in the right
direction (or think that I don't need to use it), please let me know.
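
For illustration, here is one way the shift hint could be used. This is a rough sketch, not necessarily what the book intends: it keeps a running prefix tuple and slides it forward one word at a time, so no start/end slice indexes are needed. The name build_map and the sample sentence are made up for this example.

```python
def shift(prefix, word):
    # drop the oldest word and append the newest one
    return prefix[1:] + (word,)

def build_map(words, preflen=2):
    """Map each tuple of preflen consecutive words to the words that follow it."""
    mapdict = {}
    prefix = ()
    for word in words:
        if len(prefix) < preflen:
            # still collecting the very first prefix
            prefix += (word,)
            continue
        # record word as a suffix of the current prefix, then slide forward
        mapdict.setdefault(prefix, []).append(word)
        prefix = shift(prefix, word)
    return mapdict

# made-up sample text
words = "we took the water and we took the boat".split()
mapping = build_map(words)
# mapping[('took', 'the')] is ['water', 'boat']
```

Since the prefix only advances after a suffix has been recorded, every key in the result has at least one suffix, and preflen can be changed without touching any index arithmetic.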
Thanks for all your help,
Dave