python and very large data sets???

Aahz aahz at pythoncraft.com
Wed Apr 24 18:26:09 EDT 2002


In article <mailman.1019682346.19715.python-list at python.org>,
holger krekel  <pyth at devel.trillke.net> wrote:
>
>I just don't happen to see the advantage of bringing a database into
>the picture. It seems like a classic batch job where 'random access
>many times' is not needed, so why?

From the original post:

    Things would afterwards get more complicated because I will have to
    pull out IDs from "sub_file1", remove duplicate IDs to create
    "no_dup_sub_file1", match those against the IDs in the remaining 3
    main files, and pull out the data linked with those IDs.
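
In plain Python, that dedup-and-match step is straightforward as long
as the unique IDs fit in memory.  A minimal sketch, not the original
poster's code -- it assumes one record per line with the ID in the
first whitespace-separated field, and the file names are made up:

    # Pass 1: collect the unique IDs from sub_file1.
    unique_ids = set()
    for line in open("sub_file1"):
        unique_ids.add(line.split()[0])    # duplicates collapse here

    # Pass 2: keep only the records in a main file whose ID matched.
    out = open("matched_main_file2", "w")
    for line in open("main_file2"):
        if line.split()[0] in unique_ids:
            out.write(line)
    out.close()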

This screams "*JOIN*" to me.  Now, if sub_file1 is less than 100MB,
*maybe* Python can handle it.  IMO, that is true only if the records
are strictly fixed-length.  But IME the requirements will change such
that joins over larger and larger datasets are needed -- and why
re-invent a database that's designed precisely for this purpose?
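
For comparison, here is roughly what the join looks like once the data
lives in a database.  This sketch uses Python's sqlite3 module from the
standard library; the table layout and sample rows are my assumptions,
not anything from the original post:

    import sqlite3

    conn = sqlite3.connect(":memory:")   # use a file for real data sets
    cur = conn.cursor()
    cur.execute("CREATE TABLE sub1 (id TEXT PRIMARY KEY)")
    cur.execute("CREATE TABLE main2 (id TEXT, data TEXT)")

    # The PRIMARY KEY plus OR IGNORE deduplicates the IDs on load.
    cur.executemany("INSERT OR IGNORE INTO sub1 VALUES (?)",
                    [("a",), ("b",), ("a",)])
    cur.executemany("INSERT INTO main2 VALUES (?, ?)",
                    [("a", "rec1"), ("c", "rec2"), ("b", "rec3")])

    # The JOIN pulls out exactly the records whose IDs appear in sub1.
    for row in cur.execute(
            "SELECT main2.id, main2.data FROM main2 "
            "JOIN sub1 ON sub1.id = main2.id"):
        print(row)

    conn.close()

The database does the index maintenance and the matching for you, and
the same query keeps working as the files grow.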
-- 
Aahz (aahz at pythoncraft.com)           <*>         http://www.pythoncraft.com/

What if there were no rhetorical questions?


