NoSQL Movement?

Jonathan Gardner jgardner at jonathangardner.net
Fri Mar 12 04:05:27 EST 2010


On Wed, Mar 3, 2010 at 2:41 PM, Avid Fan <me at privacy.net> wrote:
> Jonathan Gardner wrote:
>>
>> I see it as a sign of maturity with sufficiently scaled software that
>> they no longer use an SQL database to manage their data. At some point
>> in the project's lifetime, the data is understood well enough that the
>> general nature of the SQL database is unnecessary.
>>
>
> I am really struggling to understand this concept.
>
> Is it the normalised table structure that is in question or the query
> language?
>
> Could you give some sort of example of where SQL would not be the way to go?
> The only things I can think of are simple flat file databases.

Sorry for the late reply.

Let's say you have an application that does some inserts and updates
and such. Eventually, you are going to run into a limitation with the
number of inserts and updates you can do at once. The typical solution
to this is to shard your database. However, there are other solutions,
such as storing the data in a different kind of database, one which
is less general but more efficient for your particular data.

Let me give you an example. I worked on a system that would load
recipients for email campaigns into a database table. The SQL database
was nice during the initial design and prototype stage because we
could quickly adjust the tables to add or remove columns and try out
different designs. However, once our system got popular, the
limitation was how fast we could load recipients into the database.
Rather than make our DB bigger or shard the data, we discovered that
storing the recipients outside of the database in flat files was the
precise solution we needed. Each file represented a different email
campaign. The nature of the data was that we didn't need random
access, just serial access. Storing the data this way also meant
sharding the data was almost trivial. Now, we can load a massive
number of recipients in parallel.
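
To make the idea concrete, here is a minimal sketch in Python. It is
not the code from the system described above. The file naming scheme,
the function names, and the one-address-per-line format are all
assumptions made for illustration, and it relies on
concurrent.futures from the standard library of newer Python
versions.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def campaign_path(base_dir, campaign_id):
    # One flat file per campaign; this naming scheme is hypothetical.
    return os.path.join(base_dir, "campaign_%s.txt" % campaign_id)


def load_recipients(base_dir, campaign_id, recipients):
    # Append-only write: one recipient address per line.
    path = campaign_path(base_dir, campaign_id)
    with open(path, "a") as f:
        for address in recipients:
            f.write(address + "\n")
    return path


def iter_recipients(base_dir, campaign_id):
    # Serial access only: stream the file from start to finish.
    with open(campaign_path(base_dir, campaign_id)) as f:
        for line in f:
            yield line.rstrip("\n")


def load_campaigns(base_dir, campaigns, workers=4):
    # Campaigns never share a file, so they can be loaded in
    # parallel without any locking between them.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [
            pool.submit(load_recipients, base_dir, campaign_id, recipients)
            for campaign_id, recipients in campaigns.items()
        ]
        return [future.result() for future in futures]
```

Because each campaign lives in its own file, "sharding" amounts to
deciding which machine or directory holds which files. Nothing here
supports looking up a single recipient by key, which is exactly the
generality we gave up.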

You are going to discover certain patterns in how your data is used
and those patterns may not be best served by a generic relational
database. The relational database will definitely help you discover
and even experiment with these patterns, but eventually, you will find
its limits.

-- 
Jonathan Gardner
jgardner at jonathangardner.net


