String concatenation - which is the fastest way?

przemolicc at poczta.fm przemolicc at poczta.fm
Thu Aug 11 07:52:55 EDT 2011


On Thu, Aug 11, 2011 at 11:59:31AM +0100, Chris Angelico wrote:
> On Thu, Aug 11, 2011 at 7:40 AM,  <przemolicc at poczta.fm> wrote:
> > I am not a database developer so I don't want to change the whole process
> > of data flow between applications in my company. Another process is
> > reading this XML from a particular Oracle table, so I have to put the final XML there.
> 
> I think you may be looking at a submission to
> http://www.thedailywtf.com/ soon. You seem to be working in a rather
> weird dataflow. :( Under the circumstances, you're probably going to
> want to go with the original ''.join() option.
> 
> > This server has 256 GB of RAM so memory is not a problem.
> > Also, the select which fetches the data returns it sorted. That is why I have to
> > carefully divide the work into subtasks and then merge the results in the correct order.
> 
> There's no guarantee that all of that 256GB is available to you, of course.

I am the admin of this server - the memory is available to us :-)

> What may be the easiest way is to do the select in a single process,
> then partition it and use the Python multiprocessing module to split
> the job into several parts. Then you need only concatenate the handful
> of strings.

This is the approach I am going to use.
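
A rough sketch of that plan, for illustration only. It assumes the sorted
rows are already fetched into a list; the (id, value) row layout, the
<row>/<rows> element names and the helper names build_fragment, partition
and build_xml are hypothetical, not the real schema or code.

```python
from multiprocessing import Pool
from xml.sax.saxutils import escape


def build_fragment(rows):
    """Render one contiguous slice of the sorted rows as an XML fragment."""
    # Hypothetical row layout: (id, value) tuples.
    return "".join(
        '<row id="%d">%s</row>' % (row_id, escape(value))
        for row_id, value in rows
    )


def partition(rows, parts):
    """Split rows into at most `parts` contiguous slices, preserving order."""
    size = max(1, -(-len(rows) // parts))  # ceiling division
    return [rows[i:i + size] for i in range(0, len(rows), size)]


def build_xml(rows, workers=4):
    """Build the fragments in worker processes, then join them in order."""
    chunks = partition(rows, workers)
    with Pool(processes=workers) as pool:
        # Pool.map returns results in the same order as the input chunks,
        # so the ordering produced by the select is kept.
        fragments = pool.map(build_fragment, chunks)
    # Only a handful of large strings are concatenated here.
    return "<rows>" + "".join(fragments) + "</rows>"


if __name__ == "__main__":
    sample = [(i, "value & %d" % i) for i in range(10)]
    print(build_xml(sample, workers=3))
```

Because the slices are contiguous and Pool.map keeps input order, the final
join should give the same string as building everything in one process.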

> You'll need to do some serious profiling, though, to ascertain where
> the bottleneck really is. Is it actually slow doing the concatenation,
> or is it taking more time reading/writing the disk? Is it actually all
> just taking time due to RAM usage? Proper string concatenation doesn't
> need a huge amount of CPU.

I did my homework :-) - the CPU time spent on concatenation is the bottleneck.
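
For anyone who wants to repeat this kind of measurement, a minimal sketch
using timeit follows. The piece count and contents are made up, and the
numbers depend heavily on the interpreter and version (CPython can
sometimes optimise += on strings in place), so treat it as a starting
point rather than a benchmark of the real job.

```python
import timeit


def concat_plus(pieces):
    """Build the result by repeated += in a loop."""
    result = ""
    for piece in pieces:
        result += piece
    return result


def concat_join(pieces):
    """Build the result with a single ''.join() call."""
    return "".join(pieces)


# Made-up test data: 100000 small XML-like pieces.
pieces = ["<row>%d</row>" % i for i in range(100000)]

# Both approaches must produce the same string.
assert concat_plus(pieces) == concat_join(pieces)

for func in (concat_plus, concat_join):
    seconds = timeit.timeit(lambda: func(pieces), number=10)
    print("%-12s %.3f s for 10 runs" % (func.__name__, seconds))
```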


Regards
Przemyslaw Bak (przemol)