NoSQL Speed Bumps
With cloud computing gaining momentum, there is continued interest in databases suited to the cloud. Initially, advocates of the NoSQL movement touted the ability to scale out for large websites, but not all of the recent press about NoSQL databases has been positive.
The case for adopting a key-value store or other NoSQL technology rests on the tension between scalability, high availability, and consistency (the C in the ACID properties). Delivering adequate response times for Internet applications with hundreds of millions of users requires scaling out (horizontal scaling) and partitioning data (sharding).
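Sharding in this sense can be sketched as routing each key to one of N partitions, for example by hashing. The function and shard count below are hypothetical, a generic illustration rather than any particular product's scheme:

```python
import hashlib

def shard_for(key: str, num_shards: int) -> int:
    """Map a key to a shard by hashing, so data and load spread evenly."""
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Each user's rows live on exactly one shard.
shards = [shard_for(f"user{i}", 4) for i in range(1000)]
# With a good hash, the 4 shards receive roughly equal traffic.
```

A query for one user touches only that user's shard, which is what lets the fleet grow horizontally: add shards, re-map keys, and no single server has to hold the whole data set.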
Proponents of NoSQL technology point to Brewer's CAP Theorem and claim availability (the A in CAP) trumps consistency. They argue the consistency benefit offered by SQL databases is offset by the work needed to make them scale horizontally, ergo the need for new data stores with a relaxed consistency model (eventual consistency).
Now that we've traveled down this road for a while, what have we learned?
Organizations using the new NoSQL data stores are like the settlers who often traveled uncharted territory in the 19th-century American West. Some days involve the beauty of the journey; but on other days, you're dodging arrows.
When a popular website has an outage, it's newsworthy. The failure is sometimes attributed to "back-end servers," which often means database servers. A database server failure is not always due to a bug in the code. Some failures occur because certain websites are pushing the envelope on the amount of data they must handle, particularly social networks that maintain n-to-n relationships for each member. We don't have 25 years of academic research and best-practices literature about scaling to support hundreds of millions of online users. So the root cause of a NoSQL database failure might be traceable to flaws in architecture, inadequate monitoring, faulty software or hardware upgrades, security holes, and other factors besides buggy code. Substitute 'SQL' for 'NoSQL' and the previous sentence is still true, but we have accumulated years of best-practices information about SQL.
The Cassandra NoSQL data store is now an Apache project, but it originally gained fame because Facebook was its incubator. Facebook is the most popular social networking site, and it must scale to serve hundreds of millions of users. To do so, it is not a pure NoSQL shop. Facebook uses both Cassandra and MySQL (over InnoDB), with Cassandra providing Inbox searches.
The MySQL database configuration processes 13 million queries per second. Read performance has reached a peak of 450 million rows per second. Facebook has thousands of MySQL servers and databases with up to 20 shards each.
Because Facebook is the poster boy for scalability, its outages do not go unnoticed. At various times since 2007, there have been reports of outages, hacking incidents, or both.
In 2009, a database outage cut off about 150,000 Facebook users for two weeks. Facebook said it was "a technical issue with a single database." On September 23, 2010, Facebook had an outage attributed to a race condition when a monitoring system flooded the database with queries:
"The intent of the automated system is to check for configuration values that are invalid in the cache and replace them with updated values from the persistent store. This works well for a transient problem with the cache, but it doesn't work when the persistent store is invalid. … The way to stop the feedback cycle was quite painful -- we had to stop all traffic to this database cluster, which meant turning off the site."
Facebook looked at database solutions for a messaging system that integrated chat, SMS, and e-mail. It was expected to handle in excess of 120 billion messages per month. Although Facebook message content was already being stored in MySQL, Facebook evaluated MySQL, Cassandra, and HBase as the data store for the new messaging system. HBase supports a strong consistency model. Facebook selected HBase, and explained its reason for not choosing Cassandra this way:
"We found Cassandra's eventual consistency model to be a difficult pattern to reconcile for our new Messages infrastructure."
Cassandra also landed in the spotlight when an upgrade at Digg.com produced outage problems and, eventually, a management change. Digg fell over after migrating from MySQL databases to Apache Cassandra, undoubtedly in hopes of sustaining anticipated growth.
The switch to Cassandra at Digg brought scalability and availability problems and heads rolled. When Digg 4.0 had problems, NoSQL proponents claimed the problems were not related to Cassandra. However, Digg founder Kevin Rose spoke on Diggnation and pointed to bugs in Cassandra, problems with the Cassandra architecture, and data corruption problems. More recently, the new CEO of Digg, Matt Williams, has said no data was lost during the failure.
Twitter supports more than 1 billion queries per day (12,000 queries and 1,000 tweets per second). In July 2010, it announced it was changing direction with respect to Cassandra: it adopted Cassandra for analytics but abandoned plans to migrate from MySQL as the data store for tweets. Growth also motivated Twitter to change the data store for its search engine, migrating from MySQL to Apache Lucene. On July 19, 2010, Twitter had an outage, and problems persisted on July 20. The site fell over due to problems with the 125-million-row user table in an SQL database; a long-running query locked the table for hours. (One wonders what type of database partitioning and caching scheme was in play.) Twitter was eventually able to get back on the air because it had been replicating to a standby database.
Another episode of pioneers hit by arrows was Foursquare's problem with MongoDB. Foursquare is a site that caters to mobile users who want location-based information. In October 2010, Foursquare had an 11-hour outage that was followed by a second day of downtime. Scaling problems emerged due to the sharding scheme that was used to partition 132 gigabytes of data based on user name. MongoDB is a product of 10gen, whose cofounder Eliot Horowitz posted a detailed account of the reasons for the outage.
Simply put, an inordinate amount of traffic to one shard caused the database to fail and bring down the service. The problem occurred in part due to a failure to monitor activity that was putting one of the shards near capacity.
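Why sharding on user name can melt a single shard is easy to see with a toy example. Alphabetic range partitioning sends a skewed name distribution to one partition where a hash-based scheme would spread it; the split point and name data below are invented for illustration, not Foursquare's actual configuration:

```python
from collections import Counter

def range_shard(name: str) -> int:
    """Partition alphabetically by first letter: A-M -> shard 0, N-Z -> shard 1."""
    return 0 if name[:1].lower() <= "m" else 1

# A skewed user population: most names fall early in the alphabet.
users = ["alice", "bob", "carol", "dave", "erin", "mallory", "nancy"] * 100
load = Counter(range_shard(u) for u in users)
# Shard 0 takes 6x the traffic of shard 1 -- one "hot" shard near capacity.
```

Real name distributions are skewed in exactly this way, which is why range partitioning needs rebalancing and capacity monitoring; without them, one shard quietly fills up while the others idle, as Foursquare discovered.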
Following the outage, some post-mortems were of the "what if" variety:
- If Foursquare had been using Cassandra and its range partitioning, they might have had similar problems.
- There would not have been a problem with HBase due to the way its node data redistribution works.
There's a certain irony in the critics' claim that SQL platforms are not scalable. Operating on a 132 GB database is not beyond the capability of popular open source and commercial SQL DBMS products.
As the NoSQL and cloud computing trends converge, it will be interesting to see what database technology emerges as the clear winner for developing the next generation of applications. Economics is one issue that's forcing a move to the cloud. This could produce a scenario where organizations are deploying more applications to fit within the scope of a multi-tenant architecture. Multi-tenancy can produce database design constraints that are not congruent with the feature set of current NoSQL implementations (authentication, authorization, load balancing, sharding, and schema separation).