profiler -- numeric overflow?

Tue Jun 27 07:38:14 EDT 2000

On Mon, 26 Jun 2000 23:36:19 -0700, kmself at ix.netcom.com
<kmself at ix.netcom.com> wrote:
>kmself at ix.netcom.com wrote:
>> I'm new to Python.

>> ...yes, I've also considered running a smaller sample of the log, but I
>> suspect that runtime may grow nonlinearly, so large input samples are good.

>Followup to Self:

>...well, at least things appear to grow linearly.  Runtimes for varying
>numbers of input lines:
>
>  Input lines   Runtime (sec)   Lines/s
>  -----------   -------------   -------
>          100           0.210       476
>        1,000           3.860       259
>       10,000          41.300       242
>      100,000         415.670       240

>...which is darned close to straight-line growth.  However, performance
>is about an order of magnitude less than I'd like to see.
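
The Lines/s column can be rechecked from the runtimes in the table above.
This is only a small sketch; the variable names are made up for the
illustration, and the figures are copied from the table.

```python
# (input lines, runtime in seconds), copied from the table above
runs = [(100, 0.210), (1000, 3.860), (10000, 41.30), (100000, 415.670)]

def throughput(lines, seconds):
    """Lines processed per second for one run."""
    return lines / seconds

rates = [throughput(n, t) for n, t in runs]
for (n, t), r in zip(runs, rates):
    print("%7d lines  %8.3f s  %6.1f lines/s" % (n, t, r))

# Ratio of the fastest to the slowest rate among the three larger runs.
# A value close to 1.0 means runtime grows roughly linearly with input size.
spread = max(rates[1:]) / min(rates[1:])
print("spread: %.3f" % spread)
```

For the three larger runs the rates stay within roughly 8% of each other,
which is what "close to straight-line growth" looks like in numbers.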

It's very likely to be linear, yes, unless you store things in lists or
dictionaries larger than available memory (swapping massacres performance,
even in Python). However, Python's performance with strings, and regexps
in particular, is nowhere near that of Perl or C. It's likely that the
critical paths of the code can be optimized somewhat, but that depends on
the code layout (and of course on how optimal it already is ;)
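
As a rough illustration of the kind of critical-path tuning meant here: the
sketch below compiles the regexp once and binds the match method to a local
name before the loop, so the per-line work is as small as possible. The log
format, the pattern and the function name count_hosts are all hypothetical;
the original script isn't shown in this thread.

```python
import re

# Hypothetical pattern for a web-server-style log line (an assumption;
# the real script's regexps are not shown). Compiled once, at import time.
LINE_RE = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "(\S+) (\S+)')

def count_hosts(lines):
    """Count requests per host, keeping the work inside the loop minimal."""
    match = LINE_RE.match   # bind the bound method once, outside the loop
    counts = {}
    for line in lines:
        m = match(line)
        if m is None:
            continue        # skip lines that don't look like log entries
        host = m.group(1)
        counts[host] = counts.get(host, 0) + 1
    return counts

sample = [
    '10.0.0.1 - - [26/Jun/2000:23:36:19 -0700] "GET /index.html HTTP/1.0" 200 1043',
    '10.0.0.2 - - [26/Jun/2000:23:36:20 -0700] "GET /faq.html HTTP/1.0" 200 512',
    '10.0.0.1 - - [26/Jun/2000:23:36:21 -0700] "GET /index.html HTTP/1.0" 304 0',
    'garbage line',
]
print(count_hosts(sample))
```

How much this buys depends entirely on where the time actually goes, so it
is worth measuring before and after rather than assuming a gain.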

What *might* prove a significant improvement is using Python 1.6 and 'sre'
instead of 're'. The 'sre' module isn't finished, and I'm not sure how buggy
it is, but it's rumoured to be faster in certain areas. (/F gave some
impressive numbers, but those are old, and might've been for another problem
domain entirely.)
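
Since rumoured numbers may not carry over to your problem, the safest thing
is to time the regexp engines on your own data. Below is a minimal sketch of
such a harness. It assumes the module being tested offers the same compile()
interface as 're'; the function name lines_per_second, the pattern and the
sample line are invented for the example. It is shown with 're' only, so
substitute whichever module you want to compare.

```python
import re
import time

def lines_per_second(regex_module, pattern, lines, repeat=3):
    """Return the best observed search throughput for one regex module."""
    search = regex_module.compile(pattern).search
    best = None
    for _ in range(repeat):
        start = time.perf_counter()
        for line in lines:
            search(line)
        elapsed = time.perf_counter() - start
        if best is None or elapsed < best:
            best = elapsed   # keep the fastest pass to reduce timing noise
    # Guard against a zero reading on a very coarse clock.
    return len(lines) / best if best > 0 else float("inf")

# Made-up sample data; use a slice of the real log for meaningful numbers.
sample = ["GET /index.html HTTP/1.0 200 1043"] * 10000
rate = lines_per_second(re, r"(\S+) (\S+) HTTP/(\d\.\d) (\d+)", sample)
print("re: %.0f lines/s" % rate)
```

Run it once per module with the same pattern and the same lines, and compare
the resulting rates.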

Thomas.