High reliability files, writing, reading and maintaining

Larry Bates larry.bates at websafe.com
Tue Feb 7 20:20:02 EST 2006


John Pote wrote:
> Hello, help/advice appreciated.
> 
> Background:
> I am writing some web scripts in python to receive small amounts of data 
> from remote sensors and store the data in a file. 50 to 100 bytes every 5 or 
> 10 minutes. A new file for each day is anticipated.  Of considerable 
> importance is the long term availability of this data and its gathering and 
> storage without gaps.
> 
> As the remote sensors have little on board storage it is important that a 
> web server is available to receive the data. To that end two separately 
> located servers will be running at all times and updating each other as new 
> data arrives.
> 
> I also assume each server will maintain two copies of the current data file, 
> only one of which would be open at any one time, and some means of 
> indicating if a file write has failed and which file contains valid data. 
> The latter is not so important as the data itself will indicate both its 
> completeness (missing samples) and its newness because of a time stamp with 
> each sample.
> I would wish to secure this data gathering against crashes of the OS, 
> hardware failures and power outages.
> 
> So my request:
> 1. Are there any Python modules 'out there' that might help in securely 
> writing such files?
> 2. Can anyone suggest a book or two on this kind of file management? (These 
> kinds of problems must have been solved in the financial world many times.)
> 
> Many thanks,
> 
> John Pote
> 
> 
Others have made recommendations that I agree with: Use a REAL
database that supports transactions.  Other items you must
consider:

1) Don't spend a lot of time engineering your software and then
purchase the cheapest server you can find.  Most fault tolerance
has to do with dealing with hardware failures.  Eliminate as
many single-point-of-failure devices as possible.  If your
application requires 99.999% uptime, consider clustering.

2) Using RAID arrays, multiple controllers, ECC memory, etc. is
not cheap but then fault tolerance requires such investments.

3) Don't forget that power and Internet access are normally the
final single point of failure.  It doesn't matter about all the
rest if the power is off for an extended period of time.  You
will need to host your server(s) at a hosting facility that has
rock-solid Internet pipes and generator backed power.  It won't
do any good to have a kick-ass server and software that can
handle all types of failures if someone knocking over a power
pole outside your office can take you offline.
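
To make the "database that supports transactions" point concrete, here
is a minimal sketch using the sqlite3 module from the Python standard
library.  SQLite is my stand-in here only because it needs no setup; a
server database such as PostgreSQL may well be closer to what is meant
by a REAL database.  The table layout and the names open_store and
store_sample are hypothetical, made up for illustration.  The idea is
that each sample is written inside a transaction, so a crash leaves
either the whole row or nothing, and the primary key makes a repeated
delivery (for example when the two servers update each other) harmless.

```python
import sqlite3


def open_store(path):
    """Open (or create) the sample database. Schema is illustrative only."""
    conn = sqlite3.connect(path)
    # Ask SQLite to sync to disk at commit time, so that a committed
    # sample is intended to survive an OS crash or power failure.
    conn.execute("PRAGMA synchronous = FULL")
    conn.execute(
        "CREATE TABLE IF NOT EXISTS samples ("
        " sensor_id TEXT NOT NULL,"
        " taken_at  TEXT NOT NULL,"
        " payload   BLOB NOT NULL,"
        " PRIMARY KEY (sensor_id, taken_at))"
    )
    conn.commit()
    return conn


def store_sample(conn, sensor_id, taken_at, payload):
    """Store one sample atomically. Returns False if it was already stored."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception is raised.
        with conn:
            conn.execute(
                "INSERT INTO samples (sensor_id, taken_at, payload)"
                " VALUES (?, ?, ?)",
                (sensor_id, taken_at, payload),
            )
        return True
    except sqlite3.IntegrityError:
        # Same sensor and time stamp already present: a duplicate delivery.
        return False
```

A web script would call open_store once and then store_sample for each
incoming reading.  This only covers the software side; it does nothing
about the hardware and power issues above.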

Hope this info helps.

-Larry Bates


