You could create a dict that maps every word to the feature index - then for creating the feature vector you simply create a list with size of the # of words in the dict an set in this list the corresponding index. -- Regards, Diez B. Roggisch