Using Python for processing of large datasets (convincing management)
Thomas Jensen
spam at ob_scure.dk
Sat Jul 6 17:18:02 EDT 2002
William Park wrote:
[snip]
> If your cronjob can tackle 1MB but not 1GB, then I don't think this is
> programming language issue. Rather, you should look at your algorithm and
> data structure.
I am inclined to agree. The current implementation is very inefficient
in its database access: lots of small queries with no supporting
indexes, and the same data is often read multiple times.
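To illustrate the kind of difference involved, here is a minimal sketch
using Python's built-in sqlite3 module. The "orders" table, its columns,
and the function names are hypothetical and chosen only for the example;
the real schema and database are not shown in this thread. It contrasts
one small query per key with a single indexed, aggregated query that
reads the data once.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical 'orders' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, amount INTEGER)")
    rows = [(i % 100, i) for i in range(10_000)]
    conn.executemany("INSERT INTO orders VALUES (?, ?)", rows)
    # An index on the lookup column lets the database avoid full table scans.
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
    conn.commit()
    return conn


def totals_many_queries(conn, customer_ids):
    """Inefficient pattern: one small query per customer."""
    totals = {}
    for cid in customer_ids:
        row = conn.execute(
            "SELECT SUM(amount) FROM orders WHERE customer_id = ?", (cid,)
        ).fetchone()
        totals[cid] = row[0]
    return totals


def totals_single_query(conn):
    """Cheaper pattern: read the data once and let the database aggregate."""
    cur = conn.execute(
        "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
    )
    return dict(cur.fetchall())


conn = build_demo_db()
slow = totals_many_queries(conn, range(100))
fast = totals_single_query(conn)
```

Both functions return the same totals; the second makes one round trip
to the database instead of one per customer, which matters much more
when the database sits on another machine.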
> If your company is private for-profit company, then use money argument:
It is.
>
> - Anyone who knows Python or Unix shell will have the necessary
> analytical skills. And, they can easily be found on
> <comp.lang.python> or <comp.unix.shell>.
The company is based in Denmark, and I believe the number of Danes
in the group is rather small?
However, I recently heard that some Danish universities use Python as
the primary language in CS.
> - Scaling to multiple CPU is OS issue. Much easier with Linux (no
> comment on Windows :-)
I've heard that Python threads don't scale (well?) to multiple CPUs?
Maybe that's only on Windows?
If it ends up being Python, I was planning to use XML-RPC for
inter-process and inter-machine communication.
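A rough sketch of what XML-RPC between processes could look like with
the standard library. It uses the current Python 3 module names
(xmlrpc.server and xmlrpc.client; in 2002 these were SimpleXMLRPCServer
and xmlrpclib). The count_words function is a hypothetical stand-in for
a real processing step, and the server runs in a background thread here
only so the example fits in one file; in practice it would be a separate
process or machine.

```python
import threading
from xmlrpc.client import ServerProxy
from xmlrpc.server import SimpleXMLRPCServer


def count_words(text):
    """Hypothetical worker function exposed over XML-RPC."""
    return len(text.split())


# Port 0 asks the OS for any free port; a real deployment would fix the port.
server = SimpleXMLRPCServer(("127.0.0.1", 0), logRequests=False)
server.register_function(count_words, "count_words")
port = server.server_address[1]

# Serve requests in the background so the client below can run in this file.
worker = threading.Thread(target=server.serve_forever, daemon=True)
worker.start()

try:
    # The client sees the remote function as an ordinary method call.
    with ServerProxy(f"http://127.0.0.1:{port}/") as proxy:
        result = proxy.count_words("large datasets need good algorithms")
finally:
    server.shutdown()
    server.server_close()
    worker.join()
```

Each call is an HTTP request carrying an XML payload, so it suits
coarse-grained calls (hand over a batch, get back a result) better than
many tiny ones.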
> - Scaling to GB is algorithm issue. Python makes development easier,
> because it's easy to write and read.
>
> Mostly, he saves money because he will be able to find right people. The
> fact that they happen to know the right language is just bonus.
Well said.
--
Best Regards
Thomas Jensen
(remove underscore in email address to mail me)