[Tutor] script too slow

Sean 'Shaleh' Perry shalehperry@attbi.com
Sun Feb 23 13:54:02 2003


On Sunday 23 February 2003 09:57, Paul Tremblay wrote:
>
> I have completed two parts of the script. The first part uses regular
> expressions to break each line into tokens.
>
> perl => 20 seconds
> python => 45 seconds
>
> Not surprisingly, python ran slower than perl, which is designed around
> regular expressions. However, the next part proved very disappointing to
> me. This part reads each token, and determines if it is in a dictionary,
> and takes action if it is.
>

If you precompile the regex, the two often come much closer, especially if
you use the same ones over and over.

import re

is_entity = re.compile(r'&\w+;')  # compile once, reuse for every line

if is_entity.search(input):
  handle_entity(input)
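
To make the idea concrete, here is a minimal sketch of the pattern used inside
a loop. The function name count_entity_lines and the sample data are made up
for illustration; only the pattern comes from the snippet above. The point is
that re.compile runs once, outside the loop, and the loop body only calls the
compiled object's search method.

```python
import re

# Compile once at module level; the pattern is the one from the snippet above.
is_entity = re.compile(r'&\w+;')

def count_entity_lines(lines):
    """Count lines containing at least one entity such as &amp;."""
    count = 0
    for line in lines:
        # No recompilation here, just a method call on the compiled pattern.
        if is_entity.search(line):
            count += 1
    return count

sample = ["a &amp; b", "plain text", "&lt;tag&gt;"]  # made-up input
print(count_entity_lines(sample))  # prints 2
```

The re module keeps its own cache of recently compiled patterns, so how much
this saves depends on the Python version and on how many distinct patterns the
script uses. Timing both versions on real input is the only reliable check.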

>
>     # now use the dictionaries to process the tokens
>     def process_cw(self, token, str_token, space):
>         """Change the value of the control word by determining what
> dictionary it belongs to"""
>
>         if token_changed == '*':
>             pass
>         elif self.needed_bool.has_key(token_changed):
>             token_changed = self.needed_bool[token_changed]
>         elif self.styles_1.has_key(token_changed):
>             token_changed = self.styles_1[token_changed]
>         elif self.styles_2.has_key(token_changed):
>             token_changed = self.styles_2[token_changed]
>             num = self.divide_num(num,2)
>
>         # etc. There are around a dozen such statements
>
>
> It is this last function, the "def process_cw", that eats up all the
> clock time. If I skip over this function, then I chop around 30 seconds
> off the script.
>

I do not see anything glaring here.

> I am wondering if I should make the dictionaries a part of the class
> property rather than a property of the actual instance? That is, put
> them at the top of the class, and then access them with
>

If all instances share the same data, this makes sense from a design
perspective.  However, I do not believe it will improve the performance any.
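
For reference, a rough sketch of what the class-level version might look like.
The class name Converter, the lookup method and the dictionary contents are
placeholders I made up; only the attribute names needed_bool and styles_1 come
from your code. It uses the "in" operator rather than has_key, which is the
spelling current Python versions require.

```python
class Converter:
    # Hypothetical contents. These dictionaries are built once, when the
    # class is defined, and are shared by every instance.
    needed_bool = {"b": "bold", "i": "italic"}
    styles_1 = {"fs": "font-size"}

    def lookup(self, token):
        # self.needed_bool resolves to the class attribute because no
        # instance attribute of the same name shadows it.
        if token in self.needed_bool:
            return self.needed_bool[token]
        if token in self.styles_1:
            return self.styles_1[token]
        return token

a = Converter()
b = Converter()
print(a.needed_bool is b.needed_bool)  # prints True: one shared dictionary
```

Either way the dictionary is reached through an attribute lookup on self, so
the per-token cost should be about the same. The gain is that the tables are
not rebuilt for each new instance.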

>
> The dictionary part of the script seems so slow, that I am guessing I am
> doing something wrong, that Python has to read in the dictionary each
> time it starts the function.
>

Perhaps you could reorder the if statements so that the most commonly hit
cases are the first ones checked.  If you average 6 compares and can reduce
that to 2 you should see a decent improvement.
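
One way to find a good order is to count which dictionary each token lands in
on a sample of real input, then check the busiest ones first. The sketch below
is only an illustration: the table contents, the token stream and the helper
names first_match and reorder_by_hits are all invented, and it assumes no key
appears in more than one dictionary (otherwise reordering would change which
value wins). It also ignores the extra work some branches do, such as the
divide_num call.

```python
from collections import Counter

# Hypothetical tables standing in for needed_bool, styles_1, styles_2, ...
tables = [
    ("needed_bool", {"b": "bold"}),
    ("styles_1", {"fs": "font-size"}),
    ("styles_2", {"li": "left-indent"}),
]

def first_match(token, ordered):
    """Return (table name, value, number of tables checked)."""
    for checked, (name, table) in enumerate(ordered, start=1):
        if token in table:
            return name, table[token], checked
    return None, token, len(ordered)

def reorder_by_hits(tokens, ordered):
    """Put the most frequently hit tables first, based on a token sample."""
    hits = Counter()
    for token in tokens:
        name, _, _ = first_match(token, ordered)
        if name is not None:
            hits[name] += 1
    # sorted() is stable, so tables with equal hit counts keep their order.
    return sorted(ordered, key=lambda pair: -hits[pair[0]])

sample = ["li", "li", "li", "fs", "b", "li"]  # made-up token stream
before = sum(first_match(t, tables)[2] for t in sample)
tuned = reorder_by_hits(sample, tables)
after = sum(first_match(t, tuned)[2] for t in sample)
print(before, after)  # prints 15 9
```

In this toy sample the total number of membership tests drops from 15 to 9
once the most common table is checked first. Real savings depend entirely on
how skewed the actual token distribution is.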

Seeing the perl code may also help us see why the run time is so different.