How can I create customized classes that have similar properties as 'str'?

samwyse samwyse at gmail.com
Sat Nov 24 12:38:58 EST 2007


On Nov 24, 5:44 am, Licheng Fang <fanglich... at gmail.com> wrote:
> Yes, millions. In my natural language processing tasks, I almost
> always need to define patterns, identify their occurrences in a huge
> data, and count them. Say, I have a big text file, consisting of
> millions of words, and I want to count the frequency of trigrams:
>
> trigrams([1,2,3,4,5]) == [(1,2,3),(2,3,4),(3,4,5)]

BTW, if the components of your trigrams are never larger than a byte,
then encode the tuples as integers and don't worry about pointer
comparisons.

>>> def encode(s):
	return (ord(s[0])*256+ord(s[1]))*256+ord(s[2])

>>> def trigram(s):
	return [ encode(s[i:i+3]) for i in range(0, len(s)-2)]

>>> trigram('abcde')
[6382179, 6447972, 6513765]



More information about the Python-list mailing list