[Tutor] script still too slow

Paul Tremblay phthenry@earthlink.net
Sat Mar 1 03:00:02 2003


On Fri, Feb 28, 2003 at 10:14:12PM +0300, antonmuhin at rambler.ru wrote:

> 1. Psyco.
> 2. mxTexttools
> 
> Both of them are rumored to be real accelerators. And even Python2.3
> might be of help as they say that every version of Python is faster.

As a matter of fact, I have mxTextTools installed. However, it is pretty
difficult to figure out how to use. The author wrote a small
demonstration script that converts RTF into tokens, but I couldn't get
it to work on large files, probably because the tokens are stored in a
dictionary and I don't have enough memory (?).
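
If memory really is the problem, one way around it would be to yield
tokens one at a time instead of collecting them all first. Here is a
minimal sketch in plain Python (not mxTextTools). The regular expression
is a simplified assumption about RTF syntax, and the names TOKEN_RE and
iter_tokens are made up for illustration:

```python
import io
import re

# Simplified, assumed RTF token grammar (not a complete RTF parser).
TOKEN_RE = re.compile(
    r"\\'[0-9a-fA-F]{2}"      # hex-escaped character, e.g. \'e9
    r"|\\[a-zA-Z]+-?\d* ?"    # control word, optional number, optional space
    r"|\\[^a-zA-Z\r\n]"       # control symbol, e.g. \{ or \~
    r"|[{}]"                  # group open / close
    r"|[^\\{}\r\n]+"          # run of plain text
)

def iter_tokens(lines):
    """Yield tokens one at a time so only one line is held in memory."""
    for line in lines:
        for match in TOKEN_RE.finditer(line):
            yield match.group(0)

# Any iterable of lines works, including an open file object.
sample = io.StringIO("{\\rtf1\\ansi\n{\\b bold} caf\\'e9}\n")
tokens = list(iter_tokens(sample))
```

Because iter_tokens is a generator, a caller can loop over a large file
directly and never build the full token list.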

In the future, I may try to figure out mxTextTools, since it might shave
5 to 10 percent off the time the script takes.

On the other hand, I have already written a kind of hack that uses perl
to form the tokens. Forming the tokens is a very small part of the
code--perhaps as little as 30 lines, or less than one percent.

I have written the script so that when the user installs and configures
it, s/he has the option of using perl to tokenize. I am guessing that
90 percent of users who have Python installed also have perl installed,
so this doesn't seem like such a bad idea. I have read that even C++
programs that rely on regular expressions use perl. I don't know how
true this is.
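
As a rough illustration of the optional-perl arrangement described
above (this is not the actual script; the regular expression, the perl
one-liner, and the name tokenize are all assumptions made up for the
sketch), the tokenizer could hand the text to perl when it is installed
and fall back to Python's re module otherwise:

```python
import re
import shutil
import subprocess

# Simplified, assumed RTF token pattern; the same pattern is used in both paths.
PY_TOKEN_RE = re.compile(
    r"\\[a-zA-Z]+-?\d* ?|\\[^a-zA-Z\r\n]|[{}]|[^\\{}\r\n]+"
)

# Hypothetical perl program: read all of stdin, print one token per line.
PERL_SCRIPT = r'''
undef $/; $_ = <STDIN>;
print "$&\n" while /\\[a-zA-Z]+-?\d* ?|\\[^a-zA-Z\r\n]|[{}]|[^\\{}\r\n]+/g;
'''

def tokenize(text, use_perl=True):
    """Return a list of tokens, using perl only if requested and installed."""
    perl = shutil.which("perl") if use_perl else None
    if perl is None:
        # Pure-Python fallback.
        return PY_TOKEN_RE.findall(text)
    result = subprocess.run(
        [perl, "-e", PERL_SCRIPT],
        input=text, capture_output=True, text=True, check=True,
    )
    # Tokens never contain newlines, so each output line is one token.
    return [tok for tok in result.stdout.split("\n") if tok]

python_tokens = tokenize("{\\b hi}", use_perl=False)
```

Whether the perl path is actually faster would have to be measured on
real files, since starting a second process has a cost of its own.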

Perl is probably as fast as or faster than mxTextTools. I don't say this
from personal experience, but from looking at comparison charts between
languages. But certainly perl is much, much easier to figure out than
mxTextTools!

I don't know what Psyco is. I'll have to google it and see.

Thanks

Paul

-- 

************************
*Paul Tremblay         *
*phthenry@earthlink.net*
************************