object references/memory access

Steve Holden steve at holdenweb.com
Mon Jul 2 21:32:43 EDT 2007


Karthik Gurusamy wrote:
> On Jul 2, 3:01 pm, Steve Holden <s... at holdenweb.com> wrote:
>> Karthik Gurusamy wrote:
>>> On Jul 1, 12:38 pm, dlomsak <dlom... at gmail.com> wrote:
>> [...]
>>
>>> I have found the stop-and-go between two processes on the same machine
>>> leads to very poor throughput. By stop-and-go, I mean the producer and
>>> consumer are constantly getting on and off of the CPU since the pipe
>>> gets full (or empty for consumer). Note that a producer can't run at
>>> its top speed as the scheduler will pull it out since its output pipe
>>> got filled up.
>> But when both processes are in the memory of the same machine and they
>> communicate through an in-memory buffer, what's to stop them from
>> keeping the CPU fully-loaded (assuming they are themselves compute-bound)?
> 
> If you are a producer and if your output goes thru' a pipe, when the
> pipe gets full, you can no longer run. Someone must start draining the
> pipe.
> On a single core CPU when only one process can be running, the
> producer must get off the CPU so that the consumer may start the
> draining process.
> 
Wrong. The process doesn't "get off" the CPU; it remains loaded, and 
will become runnable again once the buffer has been depleted by the 
other process (which is also already loaded into memory and will become 
runnable as soon as a filled buffer becomes available).
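
As a rough illustration of the blocking behaviour described above, here is
a minimal sketch, assuming Python 3 and its subprocess module. The names
pipe_transfer, CHUNK and CONSUMER are made up for this example, and the
sizes are arbitrary. The producer simply blocks in write() while the pipe
is full and resumes once the consumer has drained it.

```python
import subprocess
import sys
import time

CHUNK = b"x" * 65536  # arbitrary block size, chosen only for illustration

# Hypothetical consumer: drains stdin and reports how many bytes it saw.
CONSUMER = (
    "import sys\n"
    "total = 0\n"
    "while True:\n"
    "    data = sys.stdin.buffer.read(65536)\n"
    "    if not data:\n"
    "        break\n"
    "    total += len(data)\n"
    "print(total)\n"
)


def pipe_transfer(n_chunks=64):
    """Push n_chunks blocks through a pipe; return (bytes_received, seconds)."""
    start = time.perf_counter()
    proc = subprocess.Popen(
        [sys.executable, "-c", CONSUMER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    for _ in range(n_chunks):
        # write() blocks in the kernel whenever the pipe is full. The
        # producer stays loaded and becomes runnable again as soon as the
        # consumer has read some of the buffered data.
        proc.stdin.write(CHUNK)
    proc.stdin.close()  # EOF tells the consumer to finish
    received = int(proc.stdout.read())
    proc.wait()
    return received, time.perf_counter() - start


if __name__ == "__main__":
    received, elapsed = pipe_transfer()
    print(f"{received} bytes in {elapsed:.3f}s (includes child startup)")
```

The elapsed time includes starting the child interpreter, so it is only
meaningful for transfers large enough to dwarf that cost.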

>>> When you increased the underlying buffer, you mitigated this shuffling
>>> a bit, and hence saw a slight increase in performance.
>>> My guess is that you can transfer across machines at really high speed
>>> because there is no process swapping, as producer and consumer run on
>>> different CPUs (machines, actually).
>> As a concept that's attractive, but it's easy to demonstrate that (for
>> example) two machines will get much better throughput using the
>> TCP-based FTP to transfer a large file than they do with the UDP-based
>> TFTP. This is because the latter protocol requires the sending unit to
>> stop and wait for an acknowledgment for each block transferred. With
>> FTP, if you use a large enough TCP sliding window and have enough
>> content, you can saturate a link as long as its bandwidth isn't greater
>> than your output rate.
>>
>> This isn't a guess ...
> 
> What you say about a stop-n-wait protocol versus TCP's sliding window
> is correct.
> But I think it's totally orthogonal to the discussion here. The issue
> I'm talking about is how to keep the end nodes chugging along, if they
> are able to run simultaneously. They can't if they aren't on a
> multi-core CPU or on different machines.
> 
If you only have one CPU then sure, you can only run one process at a 
time. But your understanding of how multiple processes on the same CPU 
interact is lacking.
> 
>>> Since the two processes are on the same machine, try using a temporary
>>> file for IPC. This is not as efficient as real shared memory -- but it
>>> does avoid the IPC stop-n-go. The producer can generate the
>>> multi-megabyte file in one go and inform the consumer. File systems
>>> have gone thru' decades of performance tuning, so this job is done
>>> really efficiently.
>> I'm afraid this comes across a bit like superstition. Do you have any
>> evidence this would give superior performance?
>>
> 
> I did some testing before when I worked on boosting shell pipeline
> performance and found using file-based IPC was very good.
> (some details at http://kar1107.blogspot.com/2006/09/unix-shell-pipeline-part-2-using.html)
> 
> Thanks,
> Karthik
> 
>>>> Thanks for the replies so far, I really appreciate you guys
>>>> considering my situation and helping out.

If you get better performance by writing files and reading them instead 
of using pipes to communicate then something is wrong.
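
Anyone who wants evidence either way can measure it. The following is a
minimal sketch of the temporary-file variant, assuming Python 3; the names
file_transfer, CHUNK and CONSUMER are hypothetical and the sizes are
arbitrary. Timing it against a pipe-based transfer of the same number of
bytes on the same machine would show which approach wins there.

```python
import os
import subprocess
import sys
import tempfile
import time

CHUNK = b"x" * 65536  # arbitrary block size, chosen only for illustration

# Hypothetical consumer: reads the named file and reports its byte count.
CONSUMER = (
    "import sys\n"
    "total = 0\n"
    "with open(sys.argv[1], 'rb') as f:\n"
    "    while True:\n"
    "        data = f.read(65536)\n"
    "        if not data:\n"
    "            break\n"
    "        total += len(data)\n"
    "print(total)\n"
)


def file_transfer(n_chunks=64):
    """Write n_chunks blocks to a temp file, then let a child read them back.

    Returns (bytes_received, seconds).
    """
    start = time.perf_counter()
    fd, path = tempfile.mkstemp()
    try:
        # The producer writes the whole file in one go ...
        with os.fdopen(fd, "wb") as f:
            for _ in range(n_chunks):
                f.write(CHUNK)
        # ... and only then is the consumer started on the finished file.
        result = subprocess.run(
            [sys.executable, "-c", CONSUMER, path],
            stdout=subprocess.PIPE,
            check=True,
        )
    finally:
        os.remove(path)
    return int(result.stdout), time.perf_counter() - start


if __name__ == "__main__":
    received, elapsed = file_transfer()
    print(f"{received} bytes in {elapsed:.3f}s (includes child startup)")
```

Note that the file version runs producer and consumer strictly one after
the other, so it measures something different from two processes working
concurrently through a pipe.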

regards
  Steve
-- 
Steve Holden        +1 571 484 6266   +1 800 494 3119
Holden Web LLC/Ltd           http://www.holdenweb.com
Skype: holdenweb      http://del.icio.us/steve.holden



