Is it more CPU-efficient to read/write config file or read/write sqlite database?

dick encore1 at cox.net
Wed Dec 18 12:49:52 EST 2013


On Wed, 18 Dec 2013 21:50:00 +1100, Chris Angelico <rosuav at gmail.com>
wrote:

>On Wed, Dec 18, 2013 at 9:31 PM, Cameron Simpson <cs at zip.com.au> wrote:
>> On 18Dec2013 14:35, Chris Angelico <rosuav at gmail.com> wrote:
>>> An SQL database *is* a different form of storage. It's storing tabular
>>> data, not a stream of bytes in a file. You're supposed to be able to
>>> treat it as an efficient way to locate a particular tuple based on a
>>> set of rules, not a different way to format a file on the disk.
>>
>> Shrug. It's all just data to me. I don't _care_ about the particular
>> internal storage format.
>
>Then use a file, because you want file semantics. That's why you have
>both options available.
>
>> Commit() is a logical operation saying this SQL changeset is now
>> part of the global state.
>
>The global state is defined by what's on the disk. Specifically, by
>what would be read if the power failed right at that moment. In the
>case of PostgreSQL, a commit doesn't actually write the table pages -
>it just writes the WAL (Write-Ahead Log), which is used to recreate
>the transaction. If something fails hard, the WAL replay will apply
>the change perfectly. That's the global state. It's not there till the
>WAL's been fsync'd.
>
  <snip>
>
>> All that has happened after an fsync() is that the OS taken your
>> SQL changeset that you commited to the OS data abstraction and
>> pushed it one layer lower into the "disk" abstraction. There's more
>> going on in there.
>
>Not just pushed it one layer lower; the point of fsync is that it's
>been pushed all the way down. See its man page [1]:
>
>"""fsync() transfers ("flushes") all modified in-core data ... to the
>disk ... so that all changed information can be retrieved even after
>the system crashed or was rebooted."""
>
>It's fundamentally about crash recovery, not about "passing it to a
>lower abstraction". Of course, the OS isn't always *able* to guarantee
>things (NFS shares are notoriously hard to pin down), but the
>intention of fsync is that it won't return (and therefore the COMMIT
>operation won't finish) until the data can be read back reliably even
>in the event of a major failure.
>
>Databases protect against that. If you want that protection, use a
>database. If you don't, use a file. There's nothing wrong with either
>option.
>
>ChrisA
>
>[1] on the web here, for those who don't have them handy:
>http://linux.die.net/man/2/fsync

Don't forget that most hard disks have an option to cache the write
data. This is a 'feature' that allows the manufacturers to claim
better write performance. You can't be sure when the data is written
to the disk if that option is in play.

Dick



More information about the Python-list mailing list