Which non SQL Database ?

Deadly Dirk dirk at pfln.invalid
Sun Jan 23 01:15:38 EST 2011


On Sat, 04 Dec 2010 16:42:36 -0600, Jorge Biquez wrote:

> Hello all.
> 
> Newbie question. Sorry.
> 
> As part of my process to learn Python I am working on two personal
> applications. Both will do fine with a simple structure of data
> stored in files. I know there are a lot of databases around I can use but I
> would like to know your advice on what other options you would consider
> for the job (it is training so no pressure on performance). One
> application will run as a desktop one, under Windows, Linux, Macintosh,
> being able to update data, not much, not complex, not many records. The
> second application, running behind web pages, will do the same, I mean,
> process simple data, updating and showing data. Not much info, not complex.
> As an exercise it is more than enough I guess and will let me learn
> what I need for now. Talking with a friend about what he would do (he uses
> C only) he suggested taking a look at the dBase file format since it is a
> stable format, fast and the index structure will be fine, or maybe going
> with the BDB (Berkeley) database file format (I hope I understood this one
> correctly). Plain files are not an option since I would like to have
> the option to do rapid searches.
> 
> What would you suggest I take a look at? If possible, available on
> the 3 platforms.
> 
> Thanks in advance for your comments.
> 
> Jorge Biquez

Well, two NoSQL databases that I have some experience with are MongoDB
and CouchDB. The choice between them depends on your application. CouchDB
is extremely simple to set up and is all about the web interface: as a
matter of fact, it communicates with the outside world over HTTP,
returning JSON objects, so you can configure it using curl. It is
also extremely fast, but it doesn't allow you to run ad hoc queries. You
have to create something called a "view", which is more akin to what
people in the RDBMS world call a "materialized view". Views are created
by running a JavaScript function on every document in the database. The
results are stored in a B-tree index, which is then updated as documents
are inserted, updated or deleted. CouchDB is completely schema free; there
are no tables, collections or "shards". The primary language for
programming Couch is JavaScript.
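
To make the "view" idea concrete, here is a minimal pure-Python sketch of
the concept. It does not use CouchDB or its API: the documents, the
map_by_city function and the helper names are all made up for
illustration, and a sorted list stands in for the real index.

```python
import bisect

# Hypothetical schema-free documents, keyed by document id.
docs = {
    "a1": {"type": "contact", "name": "Jorge", "city": "Mexico City"},
    "a2": {"type": "contact", "name": "Dirk", "city": "New York"},
    "a3": {"type": "note", "text": "call Jorge"},
}

def map_by_city(doc):
    """Plays the role of a view's map function: emit (key, value) pairs."""
    if doc.get("type") == "contact":
        yield doc["city"], doc["name"]

def build_view(documents, map_fn):
    """Run map_fn over every document and keep the rows sorted by key."""
    rows = []
    for doc_id, doc in documents.items():
        for key, value in map_fn(doc):
            rows.append((key, doc_id, value))
    rows.sort()
    return rows

def query_view(rows, key):
    """Binary search over the sorted keys, standing in for an index lookup."""
    keys = [row[0] for row in rows]
    lo = bisect.bisect_left(keys, key)
    hi = bisect.bisect_right(keys, key)
    return [value for _, _, value in rows[lo:hi]]

view = build_view(docs, map_by_city)
print(query_view(view, "New York"))
```

The point is that the query can only ask for keys the map function chose
to emit, which is why a new kind of question needs a new view.
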
Much the same applies to MongoDB, which is equally fast but does allow ad
hoc queries and has quite a few options for running them. It allows you to
do the same kind of querying as RDBMS software, with the exception of
joins. No joins. It also allows map/reduce queries written in JavaScript
and is not completely schema free. Databases have sub-objects called
"collections", which can be indexed or partitioned across several machines
("sharding"), which is an excellent thing for building shared-nothing
clusters. Collections can also be aggregated using JavaScript and
Google-style map/reduce. Scripting languages like Python are very well
supported: their drivers talk to MongoDB over its own binary protocol,
which tends to be faster than communicating over HTTP. I find MongoDB well
suited for what is traditionally known as data warehousing.
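
For anyone who has not met map/reduce before, the idea can be sketched in
a few lines of plain Python. This is not MongoDB code and does not use any
driver; the sample documents and function names are invented for the
example.

```python
from collections import defaultdict

# Hypothetical documents, as they might sit in one collection.
orders = [
    {"customer": "alice", "total": 30},
    {"customer": "bob", "total": 12},
    {"customer": "alice", "total": 8},
]

def map_order(doc):
    """Map step: emit one (key, value) pair per document."""
    yield doc["customer"], doc["total"]

def reduce_totals(key, values):
    """Reduce step: collapse all values emitted for one key."""
    return sum(values)

def map_reduce(documents, map_fn, reduce_fn):
    """Group the mapped values by key, then reduce each group."""
    groups = defaultdict(list)
    for doc in documents:
        for key, value in map_fn(doc):
            groups[key].append(value)
    return {key: reduce_fn(key, values) for key, values in groups.items()}

result = map_reduce(orders, map_order, reduce_totals)
print(result)
```
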
Of course, traditional RDBMS specimens like MySQL, PostgreSQL, Firebird,
Oracle, MS SQL Server or DB2 still reign supreme, and most of the MVC
tools like Django or TurboGears are made for RDBMS schemas: they can read
things like primary and foreign keys and build them into the application.
In short, there is no universal answer to your question. If price is a
consideration, Couch, Mongo, MySQL, PostgreSQL, Firebird and SQLite 3
all cost about the same: $0. You will have significantly less to learn
when starting with a NoSQL database, but if you need to create a serious
application fast, an RDBMS is still the right answer. You may want to look
at this YouTube clip entitled "MongoDB is web scale":

http://www.youtube.com/watch?v=b2F-DItXtZs
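
Of the free options, SQLite needs no server at all and comes with Python
as the standard sqlite3 module on Windows, Linux and Mac, so it may be the
quickest one to try for small data with fast lookups. A minimal sketch
follows; the table, column and index names are made up for the example,
and an in-memory database is used so that nothing is written to disk.

```python
import sqlite3

# ":memory:" keeps the example self-contained; pass a file name instead
# to get a database stored in a single file.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE contacts ("
    "id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)
# An index on the searched column is what makes lookups fast.
conn.execute("CREATE INDEX idx_contacts_city ON contacts (city)")

conn.executemany(
    "INSERT INTO contacts (name, city) VALUES (?, ?)",
    [("Jorge", "Mexico City"), ("Dirk", "New York")],
)
conn.commit()

# An ad hoc query, with a placeholder instead of string formatting.
rows = conn.execute(
    "SELECT name FROM contacts WHERE city = ? ORDER BY name",
    ("New York",),
).fetchall()
print(rows)
```
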



-- 
I don't think, therefore I am not.


