How to find bad row with db api executemany()?

Chris Angelico rosuav at gmail.com
Sat Mar 30 00:21:31 EDT 2013


On Sat, Mar 30, 2013 at 3:10 PM, Roy Smith <roy at panix.com> wrote:
> In article <mailman.3986.1364615879.2939.python-list at python.org>,
>  Chris Angelico <rosuav at gmail.com> wrote:
>
>> Side point: You mentioned SSDs. Are you aware of the fundamental risks
>> associated with them? Only a handful of SSD models are actually
>> trustworthy for databasing.
>
> We haven't decided if we're going that route yet, but if we do, we will
> probably use RAID SSD for added reliability.  We also have all our
> database servers in failover clusters, so we get added reliability that
> way too.

But will you know if you have corruption? Normally, transactional
integrity means:

1) If a transaction, from begin to commit, is not completely applied,
then it is completely not-applied; and
2) If a transaction that was affected by this one has been applied,
then so has this one.
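
Point 1 can be sketched in a few lines. This is only an illustration,
assuming SQLite through the standard sqlite3 module rather than whatever
database you are actually running; the table t and the function
insert_all_or_nothing are made-up names. A batch with one bad row leaves
nothing behind, not even the good rows that came before it:

```python
import sqlite3


def insert_all_or_nothing(conn, rows):
    """Insert every row in one transaction, or none of them.

    Hypothetical helper for illustration only.
    """
    try:
        # Used as a context manager, the connection commits if the block
        # succeeds and rolls back if it raises.
        with conn:
            conn.executemany("INSERT INTO t (id, val) VALUES (?, ?)", rows)
        return True
    except sqlite3.IntegrityError:
        return False


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, val TEXT NOT NULL)")
conn.commit()

# A clean batch is applied in full.
ok = insert_all_or_nothing(conn, [(1, "a"), (2, "b")])

# The second row repeats the primary key, so the whole batch is undone,
# including the first row that had already gone in.
bad = insert_all_or_nothing(conn, [(3, "c"), (3, "dup")])

count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

That guarantee only holds if the storage underneath tells the truth
about what has reached the disk.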

SSDs that lie about fsync (and some hard disks lie too, as do some
operating systems and some file system drivers, but - under Linux at
least - it's possible to guarantee the OS and FS parts) can violate
both halves. Anything might have been written, anything might have
been missed. I did some tests with PostgreSQL on an SSD, and the
results were seriously scary.
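
For reference, this is roughly what asking for durability looks like from
the application side. It is a minimal sketch, not how PostgreSQL does it
internally, and durable_write is a hypothetical name. The call chain is
flush the program's buffer, then ask the OS to push the data to the
device:

```python
import os
import tempfile


def durable_write(path, data):
    """Write bytes to path and request that they reach stable storage.

    Hypothetical helper for illustration only.
    """
    with open(path, "wb") as f:
        f.write(data)
        f.flush()             # hand Python's buffered bytes to the OS
        os.fsync(f.fileno())  # ask the OS to flush them to the device


# Usage: write a small record into a scratch directory and read it back.
with tempfile.TemporaryDirectory() as scratch:
    target = os.path.join(scratch, "record.bin")
    durable_write(target, b"committed")
    with open(target, "rb") as f:
        read_back = f.read()
```

When os.fsync() returns, all the program knows is that the device
reported success. A drive that acknowledges the flush while the data is
still sitting in a volatile cache makes that return value meaningless,
and no amount of care in the code above can detect it.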

> But, we have some runway left with more conventional technologies, so we
> don't need to decide for a while.  Ultimately, however, as reliability
> goes up and cost comes down, it's hard to imagine the SSD isn't going to
> be a big part of our lives at some point.

Yes. I hope that by then, the manufacturers will realize that TPS
(transactions per second) isn't the only thing that matters. I'm sure
SSDs will mature to the
point where we can trust all brands equally (or at least, most brands
- maybe it'll be like "server SSDs" and "desktop SSDs"?), but until
then, there aren't many options.

ChrisA