NoSQL Movement?

floaiza floaiza2 at gmail.com
Sun Mar 7 18:55:14 EST 2010


I don't think there is any doubt about the value of relational
databases, particularly on the Internet. The issue in my mind is how
to leverage all the information that resides in the "deep web" using
strictly the relational database paradigm.

Because that paradigm imposes a tight and rigid coupling between
semantics and syntax, when you attempt to efficiently "merge" or
"federate" data from disparate sources you can find yourself spending
a lot of time and money building mappings and maintaining translators.

That's why approaches that try to separate syntax from the semantics
are now becoming so popular, but, again, as others have said, it is
not a matter of replacing one with the other, but of figuring out how
best to exploit what each technology offers.

I base my remarks on some initial explorations I have made on the use
of RDF Triple Stores, which, by the way, use RDBMSs to persist the
triples, but which offer a really high degree of flexibility WRT the
merging and federating of data from different semantic spaces.
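
A minimal sketch of that flexibility, in Python with plain tuples
rather than a real triple store. The two sources, their column names,
the "person:" prefix, and the helper names are all hypothetical, and
the sketch assumes both sources already share an identifier scheme:

```python
def rows_to_triples(rows, subject_key, prefix):
    """Turn dict rows into a set of (subject, predicate, object) triples."""
    triples = set()
    for row in rows:
        subject = f"{prefix}:{row[subject_key]}"
        for column, value in row.items():
            if column != subject_key:
                triples.add((subject, column, value))
    return triples


def describe(triples, subject):
    """Collect every predicate/object pair known about one subject."""
    return {p: o for s, p, o in triples if s == subject}


# Two hypothetical sources with different schemas.
hr_rows = [{"id": "42", "name": "Ada", "dept": "R&D"}]
crm_rows = [{"id": "42", "email": "ada@example.org"}]

# No schema mapping is needed: merging is just a set union.
merged = rows_to_triples(hr_rows, "id", "person") | rows_to_triples(
    crm_rows, "id", "person"
)
```

Calling describe(merged, "person:42") then returns the name, dept, and
email together, even though no single source had all three columns.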

The way I hope things will move forward is that eventually it will
become inexpensive and easy to "expose" as RDF triples all the
relevant data that now sits in special-purpose databases.

(just an opinion)

Francisco

On Mar 3, 12:36 pm, Xah Lee <xah... at gmail.com> wrote:
> recently i wrote a blog article on The NoSQL Movement
> at http://xahlee.org/comp/nosql.html
>
> i'd like to post it somewhere public to solicit opinions, but in 20
> min or so of looking, i couldn't find a proper newsgroup or private
> list where my somewhat anti-NoSQL Movement article would fit.
>
> So, i thought i'd post here to solicit some opinions from the
> programmer community i know.
>
> Here's the plain text version
>
> -----------------------------
> The NoSQL Movement
>
> Xah Lee, 2010-01-26
>
> In the past few years, there's been a new fashionable line of thinking
> against relational databases, now blessed with a rhyming term: NoSQL.
> Basically, it considers the relational database outdated and not
> “horizontally” scalable. I'm quite dubious of these claims.
>
> According to Wikipedia's Scalability article, vertical scalability
> means adding more resources to a single node, such as more cpu or
> memory. (You can easily do this by running your db server on a more
> powerful machine.) “Horizontal scalability” means adding more
> machines. (And indeed, this is not simple with sql databases, but
> again, it is the same situation with any software, not just databases.
> To add more machines to run one single piece of software, the software
> must have some sort of grid computing infrastructure built in. This is
> not a problem of the software per se, it is just the way things are.
> It is not a problem of databases.)
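
A rough sketch of what "adding more machines" asks of the software,
assuming simple hash partitioning. The node names and the ShardedStore
class are hypothetical, and the "machines" here are just in-memory
dicts standing in for separate servers:

```python
import hashlib


def node_for_key(key, nodes):
    """Pick a node by hashing the key (stable across runs, unlike hash())."""
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


class ShardedStore:
    """Key-value store whose keys are spread over several nodes."""

    def __init__(self, node_names):
        self.nodes = list(node_names)
        # One dict per node; a real system would talk to remote machines.
        self.data = {name: {} for name in self.nodes}

    def put(self, key, value):
        self.data[node_for_key(key, self.nodes)][key] = value

    def get(self, key):
        return self.data[node_for_key(key, self.nodes)].get(key)


store = ShardedStore(["node-a", "node-b", "node-c"])
store.put("user:1", "Ada")
```

With this naive modulo scheme, adding a fourth node changes where most
keys map, so existing data has to be moved. That is one reason
horizontal scaling is not simple, whatever the data model.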
>
> I'm quite old fashioned when it comes to computer technology. In order
> to convince me of some revolutionary new-fangled technology, i must
> see improvement based on math foundation. I am an expert of SQL, and
> believe that relational database is pretty much the gist of database
> with respect to math. Sure, a tight definition of the relations in
> your data may not be necessary for many applications that simply need
> to store, retrieve, and modify data without much concern about the
> relations among them. But still, that's what relational database
> technology does too. You just don't worry about normalizing when you
> design your table schema.
>
> The NoSQL movement is really a scaling movement, about adding more
> machines, about so-called “cloud computing” and services with
> simple interfaces. (Like so many fashionable movements in the
> computing industry, it is not well defined.) It is not really
> about anti-relational design of your data. It's more about adding
> features for practical needs, such as providing easy-to-use APIs (so
> your users don't have to know SQL or schemas), the ability to add
> more nodes, commercial interface services to your database, and
> parallel systems that access your data. Of course, these needs have
> all been met by big old relational database companies such as Oracle
> over the years, as they constantly adapt to the industry's changing
> needs and cheaper computing power. If you need any relations in your
> data, you can't escape the relational database model. That is just
> the cold truth of math.
>
> Important data, such as that used in bank transactions, has relations.
> You have to have tight relational definitions and assurance of data
> integrity.
>
> Here's a second hand quote from Microsoft's Technical Fellow David
> Campbell. Source
>
>     I've been doing this database stuff for over 20 years and I
>     remember hearing that the object databases were going to wipe out
>     the SQL databases. And then a little less than 10 years ago the
>     XML databases were going to wipe out.... We actually ... you
>     know... people inside Microsoft, [have said] 'let's stop working
>     on SQL Server, let's go build a native XML store because in five
>     years it's all going....'
>
> LOL. That's exactly my thought.
>
> Though, i'd have to have some hands on experience with one of those
> new database services to see what it's all about.
>
> --------------------
> Amazon S3 and Dynamo
>
> Look at Structured storage. That seems to be what these nosql
> databases are. Most are just a key-value pair structure, or just
> storage of documents with no relations. I don't see how this differs
> from a sql database using one single table as schema.
>
> Amazon's S3 is another storage service, which uses Amazon's
> Dynamo (storage system), indicated by Wikipedia to be one of those
> NoSQL dbs. Looking at the S3 and Dynamo articles, it appears the db is
> just a distributed hash table system with an added http access
> interface. So, basically, little or no relations. Again, i don't see
> how this is different from, say, MySQL with one single table of 2
> columns, plus distributed infrastructure. (A distributed database
> is often an integrated feature of commercial dbs, e.g. Wikipedia's
> Oracle database article cites Oracle Real Application Clusters.)
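
The "one single table of 2 columns" comparison can be sketched as
below. This is only an illustration: it uses Python's built-in sqlite3
module instead of MySQL, the table and class names are made up, and it
leaves out the distributed part entirely:

```python
import sqlite3


class KeyValueTable:
    """A key-value store backed by one two-column SQL table."""

    def __init__(self, path=":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS kv (k TEXT PRIMARY KEY, v BLOB)"
        )

    def put(self, key, value):
        # INSERT OR REPLACE overwrites any existing row with the same key.
        self.conn.execute(
            "INSERT OR REPLACE INTO kv (k, v) VALUES (?, ?)", (key, value)
        )
        self.conn.commit()

    def get(self, key):
        row = self.conn.execute(
            "SELECT v FROM kv WHERE k = ?", (key,)
        ).fetchone()
        return row[0] if row else None

    def delete(self, key):
        self.conn.execute("DELETE FROM kv WHERE k = ?", (key,))
        self.conn.commit()


kv = KeyValueTable()
kv.put("bucket/key", b"some bytes")
```

The put/get/delete interface is all a key-value client sees. What this
sketch does not show is the replication and partitioning that the
distributed systems add on top.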
>
> Here's an interesting quote on S3:
>
>     Bucket names and keys are chosen so that objects are addressable
>     using HTTP URLs:
>
>         * http://s3.amazonaws.com/bucket/key
>         * http://bucket.s3.amazonaws.com/key
>         * http://bucket/key (where bucket is a DNS CNAME record
>           pointing to bucket.s3.amazonaws.com)
>
>     Because objects are accessible by unmodified HTTP clients, S3 can
>     be used to replace significant existing (static) web hosting
>     infrastructure.
>
> So this means, for example, i can store all my images in S3, and in my
> html document, the inline images are just normal img tags with normal
> urls. This applies to any other type of file: pdf, audio, and html
> too. So, S3 becomes the web host server as well as the file system.
>
> Here are Amazon's instructions on how to use it as an image server.
> Seems quite simple: How to use Amazon S3 for hosting web pages and
> media files? Source
>
> --------------------
> Google BigTable
>
> Another is Google's BigTable. I can't make much comment. To make a
> sensible comment, one must have some experience of actually
> implementing a database. For example, a file system is a sort of
> database. Suppose i created a scheme that allows me to access my data
> as files in NTFS that are distributed over hundreds of PCs,
> communicating thru http served by Apache. This would let me access my
> files. To insert or delete data, one can have cgi scripts on each
> machine. Would this be considered a new fantastic NoNoSQL?
>
> ---------------------
>
> comments can also be posted to http://xahlee.blogspot.com/2010/01/nosql-movement.html
>
> Thanks.
>
>   Xah
> http://xahlee.org/
>
>



More information about the Python-list mailing list