[Tutor] script still too slow

Paul Tremblay phthenry@earthlink.net
Thu Feb 27 11:06:04 2003


After I spent 8 hours yesterday rewriting parts of my script, it is still
almost 3 times slower than its perl counterpart.

The script reads in an RTF file and then breaks it into tokens. It then
makes several passes through the file, reading each token and performing
the appropriate action.
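
A minimal sketch of the tokenizing step described above, assuming a
simplified RTF grammar (control words, escaped characters, braces, and runs
of plain text). The `tokenize` function and its regular expression are
illustrative only and are not taken from the actual script.

```python
import re

# Simplified, assumed token grammar; real RTF has more cases
# (for example \'hh hex escapes), which are ignored here.
TOKEN_RE = re.compile(
    r"\\[a-zA-Z]+-?\d* ?"   # control word, optional numeric parameter, optional delimiter space
    r"|\\[^a-zA-Z]"         # escaped single character such as \{ \} \\
    r"|[{}]"                # group open / close
    r"|[^\\{}]+"            # run of plain text
)


def tokenize(data):
    """Split an RTF string into a list of tokens (hypothetical helper)."""
    return TOKEN_RE.findall(data)


tokens = tokenize(r"{\b hello\}}")
print(tokens)
```

Each later pass can then loop over the resulting list instead of re-reading
the raw file.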

I have profiled the script using Python's profile utility. There is no
one function that slows down the script. Rather, each function seems to
take about the same amount of time. Here is an example:


    def evaluate_token(self,token):
        """Evaluate tokens. Return a value if the token is not a 
        control word. Otherwise, pass token onto another method
        for further evaluation."""
        if token == '{':
            self.__bracket_count = self.__bracket_count + 1
            num = '%04d' % self.__bracket_count
            token = 'ob<nu<nu<nu<%(num)s<{\n' % vars()
        elif token == '}':
            num = '%04d' % self.__bracket_count
            token = 'cb<nu<nu<nu<%(num)s<}\n' % vars()
            self.__bracket_count = self.__bracket_count - 1
        elif token == r'\{':
            token = 'tx<es<nu<nu<nu<{\n'
        elif token == r'\}':
            token = 'tx<es<nu<nu<nu<}\n'
        elif token == r'\\': # double or escaped \
            token = 'tx<es<nu<nu<nu<\\\n'
        elif token[0:1] != '\\': # no leading \, so plain text
            token = 'tx<nu<nu<nu<nu<%(token)s\n' % vars()
        else:
            token = self.evaluate_cw(token)

        return token

This function takes around 17 seconds to run. In many cases the tokens
then have to be evaluated further by two other functions, each of which
also takes about 17 seconds.
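
A minimal sketch of how per-function timings like these can be collected for
the method above in isolation. It assumes a current Python, where `cProfile`
offers the same interface as the `profile` module. The `TokenEvaluator`
class, the `run` helper, the placeholder `evaluate_cw`, and the sample
tokens are all made up for illustration; only the body of `evaluate_token`
comes from the script.

```python
import cProfile
import io
import pstats


class TokenEvaluator:
    """Hypothetical wrapper class; only evaluate_token comes from the script."""

    def __init__(self):
        self.__bracket_count = 0

    def evaluate_cw(self, token):
        # Placeholder: the real control-word handling is not shown here.
        return token

    def evaluate_token(self, token):
        if token == '{':
            self.__bracket_count = self.__bracket_count + 1
            num = '%04d' % self.__bracket_count
            token = 'ob<nu<nu<nu<%(num)s<{\n' % vars()
        elif token == '}':
            num = '%04d' % self.__bracket_count
            token = 'cb<nu<nu<nu<%(num)s<}\n' % vars()
            self.__bracket_count = self.__bracket_count - 1
        elif token == r'\{':
            token = 'tx<es<nu<nu<nu<{\n'
        elif token == r'\}':
            token = 'tx<es<nu<nu<nu<}\n'
        elif token == r'\\':
            token = 'tx<es<nu<nu<nu<\\\n'
        elif token[0:1] != '\\':
            token = 'tx<nu<nu<nu<nu<%(token)s\n' % vars()
        else:
            token = self.evaluate_cw(token)
        return token


def run(tokens):
    """Evaluate every token once, as a single pass would."""
    evaluator = TokenEvaluator()
    return [evaluator.evaluate_token(tok) for tok in tokens]


# Made-up sample input, repeated to give the profiler something to measure.
sample = ['{', r'\b', 'hello', r'\}', '}'] * 20000

profiler = cProfile.Profile()
results = profiler.runcall(run, sample)

# Collect the report in a string so it can be printed or saved.
report = io.StringIO()
pstats.Stats(profiler, stream=report).sort_stats('cumulative').print_stats()
print(report.getvalue())
```

The report lists call counts and cumulative time per function, which makes
it possible to compare `evaluate_token` against the helpers it calls.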

So far, I have only converted the first two parts of the script from
perl to python, and these first two steps take 3 times as long as they do
in perl. At this rate, the script could take over 6 minutes to process a
file. That is much too long.

I am beginning to wonder whether python is just much slower than perl and
shouldn't be used for this task.

Sorry to be vague on specifics, but I can't really pinpoint the
exact cause of the slowness, and obviously I can't dump hundreds of lines
on the mailing list.

Thanks

Paul

-- 

************************
*Paul Tremblay         *
*phthenry@earthlink.net*
************************