How can I create customized classes that have similar properties as 'str'?

samwyse samwyse at
Sat Nov 24 12:38:58 EST 2007

On Nov 24, 5:44 am, Licheng Fang <fanglich... at> wrote:
> Yes, millions. In my natural language processing tasks, I almost
> always need to define patterns, identify their occurrences in a huge
> data, and count them. Say, I have a big text file, consisting of
> millions of words, and I want to count the frequency of trigrams:
> trigrams([1,2,3,4,5]) == [(1,2,3),(2,3,4),(3,4,5)]

BTW, if the components of your trigrams are never larger than a byte,
then encode the tuples as integers and don't worry about pointer

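>>> def encode(s):
...     # Pack three one-byte characters into a single integer.
...     return (ord(s[0])*256 + ord(s[1]))*256 + ord(s[2])
...
>>> def trigram(s):
...     # One encoded integer per sliding window of three characters.
...     return [encode(s[i:i+3]) for i in range(len(s) - 2)]
...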

>>> trigram('abcde')
[6382179, 6447972, 6513765]
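
Since the stated goal is counting trigram frequencies, here is a rough sketch of how the encoded integers might be fed into a counter. It assumes every character has ord() below 256, as above. The names decode and trigram_counts are made up for illustration, and collections.Counter needs a newer Python than the one in the session above.

```python
from collections import Counter


def encode(s):
    # Pack three characters into one integer, 8 bits each.
    # Assumption: every character has ord() < 256.
    a, b, c = (ord(ch) for ch in s)
    if max(a, b, c) > 255:
        raise ValueError("each component must fit in one byte")
    return (a * 256 + b) * 256 + c


def decode(n):
    # Hypothetical inverse of encode(): recover the three characters.
    return chr(n >> 16) + chr((n >> 8) & 0xFF) + chr(n & 0xFF)


def trigram_counts(text):
    # Count each encoded trigram; the keys are plain ints, not tuples.
    return Counter(encode(text[i:i + 3]) for i in range(len(text) - 2))


counts = trigram_counts('abcabcd')
print(counts[encode('abc')])   # 'abc' occurs twice in 'abcabcd'
print(decode(6382179))         # back to 'abc'
```

The keys stay small integers, so the dictionary never has to hash or compare tuples; decode() is only needed when you want to print the results.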
