help in algorithm

Paolino paolo_veronelli at tiscali.it
Wed Aug 10 10:51:55 EDT 2005


I have a self-organizing net whose aim is to cluster words.
Let's say the clustering is based on their sets of 2-grams.
The words are then instances of this class:

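class clusterable(str):
    def __abs__(self):  # the set of q-grams (meant to be calculated only once; not cached here)
        # the first character is appended so the last q-gram wraps around
        return set((self + self[0])[n:n + 2] for n in range(len(self)))
    def __sub__(self, other):  # the q-gram distance between 2 words
        set1 = abs(self)
        set2 = abs(other)
        # size of the symmetric difference of the two q-gram sets
        return len(set1 | set2) - len(set1 & set2)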
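
To make the distance concrete, here is a small worked example. The words "spam" and "spat" are just sample data picked for illustration, and the class is repeated so the snippet runs on its own.

```python
class clusterable(str):
    def __abs__(self):  # the set of 2-grams, wrapping around to the first character
        return set((self + self[0])[n:n + 2] for n in range(len(self)))
    def __sub__(self, other):  # the q-gram distance between 2 words
        set1 = abs(self)
        set2 = abs(other)
        return len(set1 | set2) - len(set1 & set2)

# sample words, chosen only for illustration
a = clusterable("spam")
b = clusterable("spat")

print(sorted(abs(a)))  # ['am', 'ms', 'pa', 'sp']
print(sorted(abs(b)))  # ['at', 'pa', 'sp', 'ts']
print(a - b)           # 4: six 2-grams in the union, two ('sp', 'pa') in common
print(a - a)           # 0
```

Note that both operands have to be clusterable instances, since abs() is not defined for a plain str.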

I'm looking for the medium of a set of words, that is, the word which
minimizes the sum of the distances from those words.

In other words, the medium is the word that minimizes: sum([medium-word for word in words])
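
One possible starting point, sketched under a strong assumption: the candidates are restricted to the given words themselves (this is usually called a medoid). A word outside the set might give a smaller sum, and this sketch does not search for it. The function name medium and the sample words are made up for illustration.

```python
class clusterable(str):
    def __abs__(self):  # the set of 2-grams, wrapping around to the first character
        return set((self + self[0])[n:n + 2] for n in range(len(self)))
    def __sub__(self, other):  # the q-gram distance between 2 words
        set1 = abs(self)
        set2 = abs(other)
        return len(set1 | set2) - len(set1 & set2)

def medium(words):
    """Return the member of words with the smallest summed distance to all words.

    Assumption: only words from the input are tried as candidates.
    On ties, the first such word in the input order is returned.
    """
    return min(words, key=lambda candidate: sum(candidate - word for word in words))

# sample data, chosen only for illustration
words = [clusterable(w) for w in ("slam", "spam", "spat")]
print(medium(words))  # spam (summed distance 8, against 12 for the other two)
```

This compares every pair of words, so the cost grows with the square of the number of words, and the q-gram sets are rebuilt on every comparison. Caching the set on each instance would remove the second cost but not the first.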


Thanks for any ideas, Paolino






More information about the Python-list mailing list