High-reliability files: writing, reading and maintaining

Scott David Daniels scott.daniels at acm.org
Tue Feb 7 18:19:37 EST 2006


John Pote wrote:
> Hello, help/advice appreciated.
> I am writing some web scripts in python to receive small amounts of data 
> from remote sensors and store the data in a file. 50 to 100 bytes every 5 or 
> 10 minutes. A new file for each day is anticipated.  Of considerable 
> importance is the long term availability of this data and its gathering and 
> storage without gaps.

This looks to me like the kind of thing a database is designed to
handle.  File systems under many operating systems have a nasty
habit of re-ordering writes for I/O efficiency, and don't necessarily
give you the behavior your application needs.  The "ACID" criteria
for database design ask that operations on the DB be:
     Atomic
     Consistent
     Isolated
     Durable
"Atomic" means that each "transaction" appears either to have happened
completely or not at all; no transaction can ever see the DB with
another transaction in a semi-completed state.  "Consistent"
says that if you have invariants that are true about the data in the
database, and each transaction preserves the invariants, the database
will always satisfy the invariants.  "Isolated" essentially says that
no transaction (such as a read of the DB) will be able to tell that it
is running in parallel with other transactions (such as writes).  "Durable"
says that, once a transaction has been committed, even pulling the plug
and restarting the DBMS should give you a database containing every
committed transaction, and no pieces of any uncommitted one.
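
To make the "Atomic" point concrete, here is a minimal sketch using the
sqlite3 module from the standard library.  SQLite is only my choice for
illustration, and the table name, column names, and function names are
all made up for this example; any DBMS with transactions would do.  The
idea is that a batch of readings goes in as one transaction, so a
failure part-way through leaves nothing behind.

```python
import sqlite3

def open_store(path=":memory:"):
    """Open (or create) the readings database.

    The schema is hypothetical: one row per sensor per timestamp.
    """
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS readings ("
        " sensor_id TEXT NOT NULL,"
        " taken_at  TEXT NOT NULL,"
        " payload   BLOB NOT NULL,"
        " PRIMARY KEY (sensor_id, taken_at))"
    )
    conn.commit()
    return conn

def store_batch(conn, rows):
    """Insert all rows in a single transaction: every row or none."""
    # Used as a context manager, the connection commits if the block
    # succeeds and rolls back if an exception escapes from it.
    with conn:
        conn.executemany(
            "INSERT INTO readings (sensor_id, taken_at, payload)"
            " VALUES (?, ?, ?)",
            rows,
        )

def count_readings(conn):
    """Return the number of stored readings."""
    return conn.execute("SELECT COUNT(*) FROM readings").fetchone()[0]
```

If a batch contains a row that violates the primary key, the insert
raises sqlite3.IntegrityError and the rows inserted earlier in that
same batch are rolled back with it.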

Databases often provide pre-packaged ways to do backups while the DB
is running.  These properties are central to database design, so I'd
suggest you consider using a DB for your application.
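
As one example of a pre-packaged backup, Python's sqlite3 module
(version 3.7 and later) exposes SQLite's online backup through
Connection.backup().  The sketch below assumes SQLite is the DB in use;
the function name and the backup path are hypothetical, and other
DBMSs have their own tools for the same job.

```python
import sqlite3

def backup_database(source_conn, backup_path):
    """Copy a live SQLite database into the file at backup_path.

    The source connection stays usable while the copy is made.
    """
    dest = sqlite3.connect(backup_path)
    try:
        # Copies every page of the source database into dest.
        source_conn.backup(dest)
    finally:
        dest.close()
```

Something like this could be run once a day from a scheduled job
without stopping the script that receives the sensor data.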

I do note that some of the most modern operating systems are trying
to provide "log-structured file systems," which may help with the
durability of file writes.  I understand there is even an attempt to
provide transactional interactions with the file system, but I'm not
sure how far along that work is.
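
If you do stay with plain files, the usual way to narrow the window in
which a write can be lost is to flush and then fsync after each record.
This is only a sketch: the function name and the one-record-per-line
format are my own assumptions, and how much fsync really guarantees
depends on the operating system, the file system, and the drive's own
write cache.  It gives you none of the atomicity or isolation a DB
would.

```python
import os

def append_record(path, record):
    """Append one record (bytes) as a line and push it toward the disk."""
    with open(path, "ab") as f:
        f.write(record + b"\n")
        f.flush()             # hand Python's buffer to the OS
        os.fsync(f.fileno())  # ask the OS to write its cache to the device
```

A crash in the middle of the write can still leave a partial last
line, so whatever reads the file should be prepared to discard one.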

-- 
-Scott David Daniels
scott.daniels at acm.org


