Pure python implementation of string-like class

Akihiro KAYAMA kayama at st.rim.or.jp
Fri Feb 24 20:26:45 EST 2006


Hi all.

I would like to ask how I can implement a string-like class using a tuple
or list. Does anyone know of any example code for a pure Python
implementation of a string-like class?

I am trying to use Python for text processing that involves a very
large character set. As the character set is wider than the range
UTF-16 can encode (up to U+10FFFF), I can't use Python's native unicode
string class.

So I want to prepare my own string class, which provides convenient
string methods such as split, join, find and others, like the usual string
class, but uses a sequence of integers as its internal representation
instead of a native string.  Obviously, subclassing str doesn't
help.
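
As a rough illustration of the idea, a tuple-backed class might look like
the sketch below. The name `CodeString` and everything in it are
hypothetical, it assumes a current Python 3 (where `str` covers what `str`
and `unicode` did), and the search is deliberately naive since efficiency
is not the concern here. Only find, split and join are shown.

```python
class CodeString:
    """Hypothetical string-like class backed by a tuple of integer code points."""

    def __init__(self, codes=()):
        # Accept a native str (converted with ord) or any iterable of ints.
        if isinstance(codes, str):
            codes = (ord(c) for c in codes)
        self._codes = tuple(codes)

    def __len__(self):
        return len(self._codes)

    def __iter__(self):
        return iter(self._codes)

    def __getitem__(self, index):
        # A slice returns a new CodeString; a single index returns an int.
        if isinstance(index, slice):
            return CodeString(self._codes[index])
        return self._codes[index]

    def __eq__(self, other):
        if isinstance(other, str):
            other = CodeString(other)
        if not isinstance(other, CodeString):
            return NotImplemented
        return self._codes == other._codes

    def __hash__(self):
        return hash(self._codes)

    def __repr__(self):
        return "CodeString(%r)" % (list(self._codes),)

    def find(self, sub, start=0):
        # Naive search: compare a window at each position; -1 if absent.
        # start is assumed to be non-negative in this sketch.
        sub = CodeString(sub)
        n = len(sub)
        for i in range(start, len(self) - n + 1):
            if self._codes[i:i + n] == sub._codes:
                return i
        return -1

    def split(self, sep):
        # Split on a non-empty separator, in the spirit of str.split(sep).
        sep = CodeString(sep)
        if len(sep) == 0:
            raise ValueError("empty separator")
        parts, pos = [], 0
        while True:
            hit = self.find(sep, pos)
            if hit < 0:
                parts.append(self[pos:])
                return parts
            parts.append(self[pos:hit])
            pos = hit + len(sep)

    def join(self, items):
        # Concatenate items with self between them, in the spirit of str.join.
        out = []
        for k, item in enumerate(items):
            if k:
                out.extend(self._codes)
            out.extend(CodeString(item)._codes)
        return CodeString(out)


# Two code points above U+10FFFF, separated by a comma (0x2C).
s = CodeString([0x110000, 0x2C, 0x110001])
left, right = s.split(",")
```

Because the constructor accepts either a native string or a sequence of
integers, every method can normalise its arguments the same way, which
keeps each method short.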

The implementation of each string method in the Python source
tree (stringobject.c) is written in C and is far from Python code, so I
have started from scratch, like below:

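    def startswith(self, prefix, start=-1, end=-1):
        # start/end are not supported yet; -1 means "not given" here.
        assert start < 0, "not implemented"
        assert end < 0, "not implemented"
        # Accept native strings by converting them to MyString first.
        if isinstance(prefix, (str, unicode)):
            prefix = MyString(prefix)
        n = len(prefix)
        # Relies on MyString supporting slicing and == comparison.
        return self[0:n] == prefix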
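
The `start` and `end` arguments left unimplemented above could be handled
by slicing first and then comparing the leading elements. The following is
only a sketch under assumptions: it is a free function (hypothetical, not
part of MyString) working on plain tuples or lists of integers, written
for Python 3, and it has not been checked against every edge case of
str.startswith (for example, a start index past the end).

```python
def startswith(codes, prefix, start=0, end=None):
    """Hypothetical helper: does codes[start:end] begin with prefix?

    Both arguments are sequences of integer code points (tuples or lists).
    """
    # Slicing clamps out-of-range indices and accepts negative ones.
    window = tuple(codes[start:end])
    prefix = tuple(prefix)
    return window[:len(prefix)] == prefix


text = (0x110000, 0x41, 0x42, 0x43)
matches_from_1 = startswith(text, (0x41, 0x42), 1)
```

Using `None` rather than -1 as the "not given" marker avoids a clash with
-1 as a legitimate index counting from the end.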

but I found it is not a trivial task for me to achieve correctness
and completeness. It also smells like "reinventing the wheel", though I
can't find any hints on Google or in the Python Cookbook.

I don't care about efficiency as a starting point. Any comments are welcome.
Thanks.

-- kayama


