object references/memory access

Karthik Gurusamy kar1107 at gmail.com
Mon Jul 2 16:16:51 EDT 2007


On Jul 1, 12:38 pm, dlomsak <dlom... at gmail.com> wrote:
> Thanks for the responses folks. I'm starting to think that there is
> merely an inefficiency in how I'm using the sockets. The expensive
> part of the program is definitely the socket transfer because I timed
> each part of the routine individually. For a small return, the whole
> search and return takes a fraction of a second. For a large return (in
> this case 21,000 records - 8.3 MB) is taking 18 seconds. 15 of those
> seconds are spent sending the serialized results from the server to
> the client. I did a little bit of a blind experiment and doubled the
> bytes on the client's socket.recv line. This improved the rate of
> transfer each time. With the original setting of 1024 bytes per
> recv, it took 47 seconds to send the 8.3 MB result. By doubling this
> size several times, I reduced the time to 18 seconds, until doubling it
> further produced diminishing results. I was always under the
> impression that keeping the send and recv byte sizes around 1024 is a
> good idea, and I'm sure that jacking those sizes up is a lousy way to
> speed up the transfer. It is also interesting to note that increasing
> the bytes sent per socket.send on the server side had no visible
> effect. Again, that was just a curious experiment.
>
> What bothers me is that I am sure sending data over the local loopback
> address should be blazing fast. 8.3 MB should be a breeze because I've
> transferred files over AIM to people connected to the same router as
> me and was able to send hundreds of megabytes in less than two or
> three seconds. With that said, I feel like something about how I'm
> send/recv-ing the data is causing lots of overhead and that I can
> avoid reading the memory directly if I can speed that up.
>
> I guess now I'd like to know what are good practices in general to get
> better results with sockets on the same local machine. I'm only
> instantiating two sockets total right now - one client and one server,
> and the transfer is taking 15 seconds for only 8.3MB. If you guys have
> some good suggestions on how to better utilize sockets to transfer
> data at the speeds I know I should be able to achieve on a local
> machine, let me know what you do. At present, I find that using
> sockets in python requires very few steps so I'm not sure where I
> could really improve at this point.
>
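The recv loop described above can be sketched as follows. This is a minimal illustration, not the poster's actual code: the 64 KB buffer size and the 1 MB payload are arbitrary, and a socketpair stands in for the client/server connection. Chunks are collected in a list and joined once at the end, since repeated `+=` on bytes is quadratic:

```python
import socket
import threading

def recv_all(sock, bufsize=65536):
    """Read until the peer closes the connection, accumulating
    chunks in a list and joining them once at the end."""
    chunks = []
    while True:
        data = sock.recv(bufsize)
        if not data:          # b'' means the sender closed
            break
        chunks.append(data)
    return b"".join(chunks)

def produce(sock, payload):
    sock.sendall(payload)     # sendall loops until everything is written
    sock.close()              # closing signals EOF to the receiver

# Loopback demo: a connected socket pair on the local machine.
a, b = socket.socketpair()
payload = b"x" * 1_000_000    # stand-in for the 8.3 MB result
t = threading.Thread(target=produce, args=(a, payload))
t.start()
received = recv_all(b)
t.join()
b.close()
print(len(received))          # → 1000000
```

Raising `bufsize` here reduces the number of system calls per megabyte transferred, which is consistent with the speedup the poster observed.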

I have found that stop-and-go between two processes on the same machine
leads to very poor throughput. By stop-and-go, I mean the producer and
consumer are constantly getting on and off the CPU, since the pipe
gets full (or empty, for the consumer). Note that the producer can't run
at its top speed: the scheduler will preempt it once its output pipe
fills up.

When you increased the underlying buffer, you mitigated this shuffling
a bit, and hence saw a slight increase in performance.
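The kernel-level socket buffers can also be enlarged explicitly with setsockopt. A minimal sketch, where the 1 MB value is an arbitrary request, not a recommendation -- the kernel treats it as a hint and may round or clamp it (Linux, for instance, doubles the requested value and caps it at a sysctl limit):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# Request larger kernel send/receive buffers (a hint, not a guarantee).
s.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 1 << 20)
s.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, 1 << 20)
# Read back what the kernel actually granted.
rcvbuf = s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
print(rcvbuf)
s.close()
```

A larger kernel buffer lets the producer keep writing while the consumer is off the CPU, which reduces the stop-and-go described above.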

My guess is that you can transfer across machines at really high speed
because there is no process swapping: the producer and consumer run on
different CPUs (different machines, actually).

Since the two processes are on the same machine, try using a temporary
file for IPC. This is not as efficient as real shared memory -- but it
does avoid the IPC stop-and-go. The producer can generate the
multi-megabyte file in one go and then inform the consumer. File
systems have gone through decades of performance tuning, so this job is
done really efficiently.
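A minimal sketch of that temp-file handoff. The payload is a placeholder, and the notification step is elided: in practice the producer would send the file's path to the consumer over the existing socket, which then carries only a few bytes instead of the whole result:

```python
import os
import tempfile

# Producer: write the whole result in one go, then hand over the path.
fd, path = tempfile.mkstemp()
payload = b"record\n" * 100_000        # stand-in for serialized records
with os.fdopen(fd, "wb") as f:
    f.write(payload)

# Consumer: receives `path` out of band (e.g. over the existing socket)
# and reads the file in a single call.
with open(path, "rb") as f:
    result = f.read()
os.unlink(path)                        # clean up the temporary file
print(len(result))                     # → 700000
```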

Thanks,
Karthik



> Thanks for the replies so far, I really appreciate you guys
> considering my situation and helping out.
