python and very large data sets???
Aahz
aahz at pythoncraft.com
Wed Apr 24 18:26:09 EDT 2002
In article <mailman.1019682346.19715.python-list at python.org>,
holger krekel <pyth at devel.trillke.net> wrote:
>
>I just don't happen to see the advantages of bringing a database into
>the picture. It seems like a classical batch job, and 'random access
>many times' is not needed, so why?
From the original post:
Things would afterwards get more complicated because I will have to
pull out ID's from "sub_file1", remove duplicate ID's, create
"no_dup_sub_file1", match those to ID's in the remaining 3 main files,
and pull out the data linked with those ID's.
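For what it's worth, here is a rough sketch of those steps in plain
Python, assuming a hypothetical format (one whitespace-separated record
per line, ID in the first column) -- the filenames and layout are my
guesses, not the original poster's:

```python
def load_unique_ids(path):
    """Collect the set of distinct IDs from sub_file1;
    the set drops duplicates automatically."""
    ids = set()
    with open(path) as f:
        for line in f:
            fields = line.split()
            if fields:
                ids.add(fields[0])
    return ids

def extract_matching(path, ids, out_path):
    """Copy records from a main file whose ID appears in the set."""
    with open(path) as src, open(out_path, "w") as dst:
        for line in src:
            fields = line.split()
            if fields and fields[0] in ids:
                dst.write(line)
```

This streams line by line, so only the ID set has to fit in memory --
which is exactly where it falls over once the IDs themselves no longer
fit.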
This screams "*JOIN*" to me. Now, if sub_file1 is less than 100MB,
*maybe* Python can handle it. IMO, that is true iff the records are
strictly fixed-length. But IME requirements will change such that joins
over larger and larger datasets will be needed -- so why re-invent a
database that's designed precisely for this purpose?
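To make the JOIN point concrete, here's a sketch using the sqlite3
module from Python's standard library as a stand-in for whatever RDBMS
gets chosen; the table and column names are made up for illustration:

```python
import sqlite3

# In-memory database just for the demonstration; a real batch job
# would use an on-disk database file.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE sub1 (id TEXT, payload TEXT);
    CREATE TABLE main1 (id TEXT, data TEXT);
""")
con.executemany("INSERT INTO sub1 VALUES (?, ?)",
                [("a", "p1"), ("a", "p2"), ("b", "p3")])
con.executemany("INSERT INTO main1 VALUES (?, ?)",
                [("a", "x"), ("b", "y"), ("c", "z")])

# DISTINCT handles the de-duplication; the JOIN pulls the linked data.
rows = con.execute("""
    SELECT DISTINCT m.id, m.data
    FROM main1 m
    JOIN sub1 s ON s.id = m.id
    ORDER BY m.id
""").fetchall()
# rows == [('a', 'x'), ('b', 'y')]
```

The point is that dedup-then-match collapses into one declarative
query, and the database's join machinery worries about indexes and
spilling to disk instead of your script.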
--
Aahz (aahz at pythoncraft.com) <*> http://www.pythoncraft.com/
What if there were no rhetorical questions?