[Tutor] storing text in databases

Alan Gauld alan.gauld at btinternet.com
Thu Feb 22 10:58:40 CET 2007


"Chae M" <pine508 at hotmail.com> wrote
> Is it a reasonable idea to store text (a few hundred words each
> record) in a database?

That depends on what you want to do with it.
If you only want to store the data and maybe do some searches
for words etc. then no, a simple folder full of text files will be
more useful. (grep rules!)
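
If you would rather stay inside Python than shell out to grep, a
minimal sketch of the same idea could look like this. The function
name search_notes and the .txt naming convention are my assumptions,
not anything standard:

```python
from pathlib import Path


def search_notes(folder, word):
    """Return (filename, line number, line) for every line containing word.

    Case-insensitive substring match over the *.txt files in folder.
    """
    hits = []
    for path in sorted(Path(folder).glob("*.txt")):
        with open(path, encoding="utf-8") as f:
            for lineno, line in enumerate(f, start=1):
                if word.lower() in line.lower():
                    hits.append((path.name, lineno, line.rstrip("\n")))
    return hits
```

Calling search_notes("notes", "sqlite") on a (hypothetical) folder
called notes gives you roughly what grep -in would.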

But if you want to store and access information about the
text - say the author, date changed, etc or if you want to
dynamically generate longer texts from the smaller parts
then a database could be reasonable. But databases are
generally more useful the more structured the data. Storing
free text in a database without accompanying structured
data is not usually a big help.
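
As a rough illustration of what "accompanying structured data" buys
you, here is a small sketch using the sqlite3 module. The table name,
column names and sample rows are all invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database for the demo
conn.execute("""
    CREATE TABLE notes (
        id      INTEGER PRIMARY KEY,
        author  TEXT NOT NULL,
        changed TEXT NOT NULL,   -- ISO date string, so text order = date order
        body    TEXT NOT NULL
    )
""")
conn.executemany(
    "INSERT INTO notes (author, changed, body) VALUES (?, ?, ?)",
    [
        ("alan", "2007-02-20", "Use grep for plain word searches."),
        ("chae", "2007-02-21", "Learning SQLite with Python."),
        ("alan", "2007-02-22", "Databases suit structured data."),
    ],
)
conn.commit()

# Questions about the text (who wrote it, when it changed) are where SQL helps.
by_author = conn.execute(
    "SELECT id, changed FROM notes WHERE author = ? ORDER BY changed",
    ("alan",),
).fetchall()

recent = conn.execute(
    "SELECT author, body FROM notes WHERE changed >= ? ORDER BY changed",
    ("2007-02-21",),
).fetchall()
```

The queries are on the author and date columns; the body just comes
along for the ride.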

> I'm learning to use the SQLite database

Try my Database topic in my tutor. It describes some theory,
some SQL and some Python, all based on SQLite.

> that comes with Python, and it seems like a good idea in terms
> of being able to search for notes, etc.

A general text tool like grep might be more effective.

> - Can formatting of text be stored or is it always only raw text?

Formatting is a display thing and how it is implemented varies
a lot. HTML text has the formatting embedded. Some
wordprocessors put the formatting information at the start
or end of the file. Others use a separate file (not so common
nowadays). Many programs use non-text values for formatting
codes and other programs might choke on them.

In essence the database will store the bytes you give it.
Whether you can display the formatted result when you extract
the data is another matter.
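
A quick sketch to show the "stores the bytes you give it" point with
SQLite. The docs table and the sample values are made up, and the
binary "formatting codes" are not from any real word processor:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# No declared type on content, so SQLite keeps each value as it was given.
conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, content)")

html = "<p>Some <b>bold</b> text</p>"   # formatting embedded as markup
raw = b"\x1bE bold \x1bF"               # invented binary formatting codes

conn.execute("INSERT INTO docs (content) VALUES (?)", (html,))
conn.execute("INSERT INTO docs (content) VALUES (?)", (raw,))

# What comes back is exactly what went in: a str for the HTML, bytes for the rest.
stored = [row[0] for row in conn.execute("SELECT content FROM docs ORDER BY id")]
```

Rendering the bold text is then up to whatever program reads the
value back out; the database does nothing with the markup.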

> - What is the upper limit on the size of a stored document?

It varies by database. Some limit text fields to 2000 characters;
in others it's much bigger (32KB or 64KB used to be common on DOS).
Most will support a BLOB field that will hold more or less
anything, but you can't then search the content.

>  would this also work for big documents (10-100 pages?)
> - Is this a good idea/can it work?

The bigger the size of the data the less powerful the searches
you will be able to do and the longer they will take. It's usually
better to store the document in a file and use the database for
the meta-data and the filename.
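
A minimal sketch of that arrangement, assuming SQLite and plain text
files. The documents table, the docN.txt naming scheme and the two
helper functions are hypothetical, just one way of doing it:

```python
import sqlite3
import tempfile
from pathlib import Path

doc_dir = Path(tempfile.mkdtemp())   # stand-in for your real documents folder
conn = sqlite3.connect(":memory:")   # use a filename here to keep the data
conn.execute("""
    CREATE TABLE documents (
        id       INTEGER PRIMARY KEY,
        title    TEXT,
        author   TEXT,
        filename TEXT
    )
""")


def add_document(title, author, text):
    """Write text to its own file and record only the meta-data in the database."""
    cur = conn.execute(
        "INSERT INTO documents (title, author, filename) VALUES (?, ?, '')",
        (title, author),
    )
    doc_id = cur.lastrowid
    filename = f"doc{doc_id}.txt"
    (doc_dir / filename).write_text(text, encoding="utf-8")
    conn.execute("UPDATE documents SET filename = ? WHERE id = ?", (filename, doc_id))
    conn.commit()
    return doc_id


def load_document(doc_id):
    """Look up the filename in the database, then read the text from disk."""
    row = conn.execute(
        "SELECT filename FROM documents WHERE id = ?", (doc_id,)
    ).fetchone()
    return (doc_dir / row[0]).read_text(encoding="utf-8")
```

The database rows stay small whatever the size of the documents, and
the files themselves remain searchable with grep.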

-- 
Alan Gauld
Author of the Learn to Program web site
http://www.freenetpages.co.uk/hp/alan.gauld 



