How to find bad row with db api executemany()?

Fri Mar 29 23:36:54 EDT 2013

In article <mailman.3981.1364613280.2939.python-list at python.org>,
 Chris Angelico <rosuav at gmail.com> wrote:

> Hmm. I heard around the forums that Amazon weren't that great at disk
> bandwidth anyway, and that provisioning IO was often a waste of money.

Au contraire.  I guess it all depends on what you're doing.  If you're
CPU bound, increasing your I/O bandwidth won't help.  But, at least on 
our database (MongoDB) servers, we saw a huge performance boost when we 
started going for provisioned IO.
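
A rough way to check which side of that line a workload falls on is to
compare CPU time against wall-clock time.  This is only a minimal sketch
using the standard library; cpu_fraction, busy_loop and waiting are
made-up names for illustration, and the sleep is just a stand-in for
waiting on disk or network, not a real database workload.

```python
import time

def cpu_fraction(work):
    """Return the fraction of wall-clock time that work() spent on the CPU."""
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    work()
    cpu_used = time.process_time() - cpu_start
    wall_used = time.perf_counter() - wall_start
    return cpu_used / wall_used if wall_used > 0 else 0.0

def busy_loop():
    # Hypothetical CPU-bound stand-in: pure arithmetic, no waiting.
    total = 0
    for i in range(2_000_000):
        total += i * i
    return total

def waiting():
    # Hypothetical I/O-bound stand-in: sleeping mimics blocking on disk
    # or network, so almost no CPU time is consumed.
    time.sleep(0.2)

if __name__ == "__main__":
    print("busy_loop:", round(cpu_fraction(busy_loop), 2))
    print("waiting:  ", round(cpu_fraction(waiting), 2))
```

A fraction near 1 suggests the work is CPU bound, so faster disks are
unlikely to help; a fraction near 0 suggests it spends most of its time
waiting, which is where more I/O bandwidth can pay off.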

> But we never did all that much research on Amazon I/O
> performance; shortly after doing some basic benchmarking, we decided
> that the cloud was a poor fit for our system model, and went looking
> at dedicated servers with their own RAID storage right there on the
> bus.

As far as I can tell, on a raw price/performance basis, AWS is pretty
expensive.  But, from a convenience standpoint, it's hard to beat.

Case in point: We've been thinking about SSD as our next performance 
step-up.  One day, we just spun up some big honking machine, configured 
it with 2 TB of SSD, and played around for a while.  Wicked fast.  Then 
we shut it down.  That experiment probably cost us $10 or so, and we 
were able to run it on the spur of the moment.

Another example was last summer when we had a huge traffic spike because
of a new product release.  The amount of new traffic it generated caught
us by surprise.  Our site was in total meltdown.  We were able to spin
up 10 new servers in an afternoon.  If we had to go out and buy 
hardware, have it shipped to us, figure out where we had rack space, 
power, network capacity, cooling, etc., we'd have been out of business
before we got back on the air.

Yet another example.  We just (as in, while I've been typing this) had 
one of our servers go down.  Looks like the underlying hardware the VM 
was running on croaked, because when the instance came back up, it had a 
new IP address.  The whole event was over in a couple of minutes, with 
only minor disruption to the service.  And, presumably, there's some 
piece of hardware somewhere in Virginia that needs repairing, but that's 
not our problem.

The really big boys (Google, Facebook) run their own data centers.  But, 
some surprisingly large operations run out of AWS.  Netflix, for 
example.  The convenience and flexibility are worth a lot.