NoSQL

March 15th, 2012

It’s new, cool (anti-establishment) and the big guys are using it! It’s built for performance, reliability and scalability, and to top it off, it’s free. Why shouldn’t you build your application/business with it?

Before jumping in feet first, though, I think it is necessary to slow down for a minute and fully understand the advantages of a NoSQL solution along with the tradeoffs you may be forced to make.

First, as I stated, NoSQL is increasingly being hyped as a next-generation database that fixes all the performance, scalability, and complexity problems you might encounter when using relational databases. However, while NoSQL does deliver these powerful capabilities, it requires a number of very serious compromises that can be detrimental to a business. To obtain high performance and scalability, NoSQL implementations remove what some may consider unwanted and unnecessary functionality of the relational database. The problem is that removing this functionality may come at a high cost for many normal business requirements.

For example, a key premise of many NoSQL databases is to relax the atomicity, consistency, isolation, durability (ACID) guarantees in favor of Basically Available data with Soft state that becomes Eventually consistent (BASE). This essentially means that when you ask a question, you will eventually get a complete and accurate answer if you wait long enough, but in the (brief) meantime you may get results that are stale or only partially correct. That may be acceptable for a 140-character Tweet, but it is certainly unacceptable for a series of ATM transactions.
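To make the stale-read window concrete, here is a minimal sketch in Python. It is a toy model, not the behavior of any particular NoSQL product: the `EventuallyConsistentStore` class, its single primary and single replica, and the explicit `replicate()` step are all assumptions made for illustration.

```python
from collections import deque


class EventuallyConsistentStore:
    """Toy key-value store: one primary, one replica, asynchronous replication.

    Hypothetical illustration only; real systems replicate in the background
    and offer various tunable consistency levels.
    """

    def __init__(self):
        self.primary = {}
        self.replica = {}
        self.pending = deque()  # writes not yet applied to the replica

    def write(self, key, value):
        # The write is acknowledged as soon as the primary has it.
        self.primary[key] = value
        self.pending.append((key, value))

    def read_from_replica(self, key, default=0):
        # May return stale data while replication is still pending.
        return self.replica.get(key, default)

    def replicate(self):
        # Apply all queued writes to the replica, in order.
        while self.pending:
            key, value = self.pending.popleft()
            self.replica[key] = value


store = EventuallyConsistentStore()
store.write("balance", 100)
store.replicate()

store.write("balance", 60)                  # a withdrawal of 40 reaches the primary
stale = store.read_from_replica("balance")  # replica has not caught up yet
store.replicate()
fresh = store.read_from_replica("balance")  # replica is now consistent

print(stale, fresh)
```

Between the second write and the second `replicate()` call, a reader hitting the replica still sees the old balance. For a Tweet that is harmless; for an ATM it means a second withdrawal could be approved against money that is already gone.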

Another example relates to aggregates. Some NoSQL implementations offer only a limited ability to perform SUM(), MAX(), AVG(), or GROUP BY operations, because they are architected to provide highly efficient CRUD operations against objects, documents, or graphs. This makes normal day-to-day operations very scalable for end users of highly specialized applications. But if management suddenly requests the total number of orders placed by customers referred by partners in Colorado in order to evaluate tax liability, NoSQL may not be able to provide the answer easily.
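The Colorado question can be sketched both ways. The example below is only an illustration under stated assumptions: the table names, fields, and sample rows are invented, the relational side uses Python's built-in `sqlite3` module, and the "document" side is modeled as plain lists of dictionaries standing in for a store that offers lookups but no server-side joins or aggregates. Some NoSQL products do provide map-reduce or aggregation features, so treat this as the worst case rather than the rule.

```python
import sqlite3

# Relational side: the ad hoc question is a single declarative query.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE partners  (id INTEGER PRIMARY KEY, state TEXT);
    CREATE TABLE customers (id INTEGER PRIMARY KEY, partner_id INTEGER);
    CREATE TABLE orders    (id INTEGER PRIMARY KEY, customer_id INTEGER);
    INSERT INTO partners  VALUES (1, 'CO'), (2, 'TX');
    INSERT INTO customers VALUES (10, 1), (11, 1), (12, 2);
    INSERT INTO orders    VALUES (100, 10), (101, 10), (102, 11), (103, 12);
""")

sql_total = conn.execute("""
    SELECT COUNT(*)
    FROM orders o
    JOIN customers c ON c.id = o.customer_id
    JOIN partners  p ON p.id = c.partner_id
    WHERE p.state = 'CO'
""").fetchone()[0]

# Document side (hypothetical): the same data as separate collections,
# with the join and the count written by hand in application code.
partner_docs = [{"_id": 1, "state": "CO"}, {"_id": 2, "state": "TX"}]
customer_docs = [
    {"_id": 10, "partner_id": 1},
    {"_id": 11, "partner_id": 1},
    {"_id": 12, "partner_id": 2},
]
order_docs = [
    {"_id": 100, "customer_id": 10},
    {"_id": 101, "customer_id": 10},
    {"_id": 102, "customer_id": 11},
    {"_id": 103, "customer_id": 12},
]

co_partners = {p["_id"] for p in partner_docs if p["state"] == "CO"}
co_customers = {c["_id"] for c in customer_docs if c["partner_id"] in co_partners}
doc_total = sum(1 for o in order_docs if o["customer_id"] in co_customers)

print(sql_total, doc_total)
```

Both approaches arrive at the same count, but the second requires pulling every collection into the application and writing the join logic for each new question management asks.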

There are obviously many advantages to using a NoSQL solution; otherwise the big players wouldn’t have spent the time, effort and resources developing the specialized solutions they felt they needed to solve their problems. However, as someone with extensive experience in database design and performance tuning, I also know that many applications run on poorly designed databases and that there are numerous ways to improve the performance of those databases.

I think it is important to look at the big picture, understand the applications that NoSQL and SQL databases are each best suited for, and treat each as just one tool in your toolbox. When it is time to build your application, reach into that toolbox and grab the best tool for the job: not NoSQL merely because it is new and cool or, for that matter, SQL because it is what you have always used. You also have to understand your application, and that the current flavors of NoSQL databases are designed to solve specialized problems involving extremely large data sets; they are not a one-size-fits-all solution. Finally, it is fairly easy to obtain great performance from a relational database, and unless you collect data on the scale of Google or Facebook, a relational database may still be your best choice.