[Tutor] database question

jpollack@socrates.Berkeley.EDU
Mon Jul 28 13:32:25 2003


Hi,

For all of you Pythonistas who use the language to handle large amounts
of data, especially in a bioinformatics context:

I'm pretty new to the language and up to this point, I've been able to
make do with storing data in a text file.  However, I'm building a tool
that digs out data from the Genome Browser and then excises what I think
are putative promoter regions.  My question is this:

The entire file I'm generating is about 223 megabytes.  Trying to read in
a text file of this size is just impractical.  I think this probably
calls for a database application, but I've never used one before.
I really don't know any SQL either, aside from the basic concept.

Does anyone have any thoughts about which databases are Python-friendly
and would be easy for a beginner to use?  Also, I assume there are Python
modules out there for populating and querying such things?


The only fields I would have would be: Accession Number, Chromosome,
Strand State, Hits (different database calls for the transcription start
site... all integers), and the sequence (this is about 7000 bytes
generally).
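
To make the question concrete, here is a rough sketch of how those fields
might map onto tables. It assumes SQLite through Python's sqlite3 module,
which is only one possible choice. The table names, column names, and the
helper functions create_db, add_record, and fetch_by_chromosome are all
hypothetical, and strand is assumed to be a short string such as "+" or
"-". Because Hits is several integers per record, the sketch keeps them in
a separate table, one row per source.

```python
import sqlite3


def create_db(path=":memory:"):
    """Open (or create) the database and make sure both tables exist."""
    conn = sqlite3.connect(path)
    # One row per excised region; column names are illustrative only.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS promoters (
               accession  TEXT PRIMARY KEY,
               chromosome TEXT NOT NULL,
               strand     TEXT NOT NULL,
               sequence   TEXT NOT NULL
           )"""
    )
    # One row per (record, source) pair for the integer hit positions.
    conn.execute(
        """CREATE TABLE IF NOT EXISTS hits (
               accession TEXT NOT NULL REFERENCES promoters(accession),
               source    TEXT NOT NULL,
               position  INTEGER NOT NULL
           )"""
    )
    return conn


def add_record(conn, accession, chromosome, strand, hits, sequence):
    """Insert one record; `hits` maps a source name to an integer position."""
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO promoters VALUES (?, ?, ?, ?)",
            (accession, chromosome, strand, sequence),
        )
        conn.executemany(
            "INSERT INTO hits VALUES (?, ?, ?)",
            [(accession, source, pos) for source, pos in hits.items()],
        )


def fetch_by_chromosome(conn, chromosome):
    """Return (accession, strand, sequence) rows for one chromosome."""
    cur = conn.execute(
        "SELECT accession, strand, sequence FROM promoters "
        "WHERE chromosome = ? ORDER BY accession",
        (chromosome,),
    )
    return cur.fetchall()
```

With something like this, a record would be stored once via
add_record(conn, accession, chromosome, strand, hits, sequence) and later
pulled back with fetch_by_chromosome(conn, "chr1"), so only the rows that
match are read rather than the whole 223 MB file.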


Thanks so much,

Joshua Pollack