[Tutor] script still too slow
Paul Tremblay
phthenry@earthlink.net
Sat Mar 1 03:00:02 2003
On Fri, Feb 28, 2003 at 10:14:12PM +0300, antonmuhin at rambler.ru wrote:
> 1. Psyco.
> 2. mxTexttools
>
> Both of them are rumored to be real accelerators. And even Python2.3
> might be of help as they say that every version of Python is faster.
As a matter of fact, I have mxTexttools installed. However, it is pretty
difficult to figure out how to use. The author wrote a small
demonstration script that actually converts RTF into tokens. However, I
couldn't get this script to work on large files, probably because the
tokens are stored in a dictionary, and I don't have enough memory (?).
In the future, I may try to figure out mxTexttools, since it might shave
from 5 to 10 percent off the time the script takes.
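
If memory really is the problem, one possible way around it is to hand out tokens one at a time rather than storing them all first. The following is only a minimal sketch of that idea in plain Python. It is not the mxTexttools demonstration script, the names TOKEN_RE and tokenize are made up for illustration, and the pattern is a simplified assumption about RTF syntax (control words, escaped characters, braces, and runs of text), not a full RTF grammar.

```python
import re

# Hypothetical, simplified RTF token pattern (an assumption, not a full grammar):
#   \word or \word123 (with one optional trailing space), \'hh hex escapes,
#   other backslash escapes, group braces, and runs of plain text.
TOKEN_RE = re.compile(
    r"\\[a-zA-Z]+-?\d* ?"      # control word such as \par or \fs24
    r"|\\'[0-9a-fA-F]{2}"      # hex escape such as \'e9
    r"|\\[^a-zA-Z]"            # control symbol such as \{ or \~
    r"|[{}]"                   # group open or close
    r"|[^\\{}]+"               # plain text
)


def tokenize(lines):
    """Yield tokens one at a time so the whole file is never held in memory."""
    for line in lines:
        for match in TOKEN_RE.finditer(line.rstrip("\r\n")):
            yield match.group()
```

Because tokenize accepts any iterable of lines, an open file object can be passed in directly and each token processed as it arrives, so memory use should stay roughly constant no matter how large the file is.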
On the other hand, I have already written a kind of hack that uses perl
to form the tokens. Forming the tokens is a very small part of the
code--perhaps as little as 30 lines, or less than one percent.
I have written the script so that when the user installs and configures
it, s/he has the option of using perl to tokenize. I am guessing that
90 percent of users who have Python installed also have perl installed,
so this doesn't seem like such a bad idea. I have read that even C++
programs that rely on regular expressions use perl. I don't know how
true this is.
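
An optional perl tokenizer of this kind could be wired up roughly as follows. This is a sketch under assumptions, not the actual script: the function names and the idea of a separate perl script that prints one token per line are hypothetical. It falls back to Python whenever perl was not requested or is not on the PATH.

```python
import shutil
import subprocess


def pick_tokenizer(prefer_perl):
    """Return "perl" only if the user asked for it and perl is on the PATH."""
    if prefer_perl and shutil.which("perl") is not None:
        return "perl"
    return "python"


def run_perl_tokenizer(script, rtf_path):
    """Yield the output lines of a (hypothetical) perl tokenizer script.

    The script is assumed to print one token per line on standard output.
    """
    proc = subprocess.Popen(
        ["perl", script, rtf_path], stdout=subprocess.PIPE, text=True
    )
    with proc.stdout:
        for line in proc.stdout:
            yield line.rstrip("\n")
    proc.wait()
```

Checking for perl at run time means a user who configured the perl option on one machine still gets a working, if slower, script on a machine without perl.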
Perl is probably as fast as or faster than mxTexttools. I don't say this
from personal experience, but from looking at comparison charts between
languages. But certainly perl is much, much easier to figure out than
mxTexttools!
I don't know what Psyco is. I'll have to google it and see.
Thanks
Paul
--
************************
*Paul Tremblay *
*phthenry@earthlink.net*
************************