effects on extended modules
Pedro
pedro_rodriguez at club-internet.fr
Fri Dec 28 05:19:56 EST 2001
"Curtis Jensen" <cjensen at bioeng.ucsd.edu> wrote:
> Pedro <pedro_rodriguez at club-internet.fr> wrote in message
> news:<pan.2001.12.06.12.45.41.172197.2456 at club-internet.fr>...
>> "Curtis Jensen" <cjensen at bioeng.ucsd.edu> wrote:
>>
>> > Kragen Sitaker wrote:
>> >>
>> >> Curtis Jensen <cjensen at bioeng.ucsd.edu> writes:
>> >> > We have created a Python interface to some core libraries of our
>> >> > own making. We also have a C interface to these same libraries.
>> >> > However, the Python interface seems to affect the speed of the
>> >> > extended libraries, i.e. some library routines have their own
>> >> > benchmark code, and the time of execution from the start of the
>> >> > library routine to the end of the library routine (not including
>> >> > any Python code execution) takes longer than its C counterpart.
>> >>
>> >> In the Python version, the code is in a Python extension module,
>> >> right?
>> >> A .so or .dll file? Is it also in the C counterpart? (If that's
>> >> not
>> >> it, can you provide more details on how you compiled and linked the
>> >> two?)
>> >>
>> >> In general, referring to dynamically loaded things through symbols
>> >> --- even from within the same file --- tends to be slower than
>> >> referring to things that aren't dynamically loaded.
>> >>
>> >> What architecture are you on? If you're on the x86, maybe Numeric
>> >> is being stupid and allocating things that aren't maximally aligned.
>> >> But you'd probably notice a pretty drastic difference in that case.
>> >>
>> >> ... or maybe Numeric is being stupid and allocating things in a way
>> >> that causes cache-line contention.
>> >>
>> >> Hope this helps.
>> >
>> > Thanks for the response. The C counterpart is directly linked
>> > together into one large binary (yes, the Python version is using a
>> > dynamically linked object file, a .so). So, that might be the source
>> > of the problem. I can try to make a dynamically linked version of
>> > the C counterpart and see how that affects the speed. We are running
>> > on IRIX 6.5 machines (MIPS).
>> > Thanks.
>> >
>> >
>> Don't know if this helps, but I had a similar problem on Linux.
>>
>> The context was: a Python script was calling an external program and
>> parsing its output (with popen) many times. I decided to optimize this
>> by turning the external program into a dynamically linked library with
>> Python bindings. I expected to save the extra system calls to fork and
>> start a new process, but it turned out that this solution was slower.
>>
>> The problem was caused by multithreading. When using the library
>> straight from a C program, I didn't link with the multithreaded
>> libraries, so the system calls weren't protected (they didn't need to
>> lock and unlock their resources).
>>
>> Unfortunately, the library was reading files with fgetc (character by
>> character :( ). Since the Python version I used was compiled with
>> multithreading enabled, fgetc in that context used lock/unlock
>> protection, which caused the extra waste of time.
>>
>> To find this, I compiled my library with profiling (I think I needed to
>> use some system call to activate profiling from the library, since I
>> couldn't rebuild Python).
>>
>> OT: in the end I fixed the library (fgetc replaced by fgets), and
>> didn't gain anything by turning the external program into a Python
>> extension. Since the Linux disk cache seemed good enough, I removed
>> the Python extension, keeping a pure Python program, and implemented a
>> cache for the results of the external program. This was much simpler
>> and more efficient in this case.
>
>
> Is this a problem with I/O only? The code sections that we
> benchmarked have no I/O in them.
>
> --
> Curtis Jensen
In my case, it was only I/O related.

If your problem, as I understand it, is:
+ you have a function f() written in C
+ f() does its own benchmarking, reporting how much time it took
  to complete
+ calling f() from a C binary gives a (significantly) shorter duration
  than calling the same f() from a Python extension
then you may have to check what f() is doing, because, as I was saying,
it may be affected by the Python environment:
- Are you making extensive calls to an external library?
  In my case, some glibc calls need to enforce reentrancy protection
  when running in a multithreaded context. These protections wiped out
  any gain.
- If you're calling external libraries, are you linked against the same
  versions? (ldd on the binaries and libraries may help.)
- More basically, did you compile with the same options? Could the
  differences point to a possible source of your problem? (It may be
  worth checking optimization, debug, and conditional compilation
  options.)
Regards,
--
Pedro
More information about the Python-list
mailing list