Google developed Bigtable to distribute data across thousands of servers and scale to petabyte-sized data sets. A variety of Google applications use it, including Web indexing, Google Earth, Google Maps, Blogger, YouTube, and Gmail. The YouTube collection of 100 million videos requires 600 TB of storage. Bigtable is proprietary, but the data model exists in open source implementations, including Hypertable, Cassandra, and HBase. Bigtable can be used as input or output to MapReduce, which enables distributed processing of files or databases using mapping and reduction functions.
Dynamo was created to provide a key-value data store designed for high availability, permitting updates to survive server failures and network outages. Amazon subsequently built SimpleDB as a key-value store available for Amazon Web Services customers. SimpleDB, which is in beta, is restricted to items having no more than 256 attribute name-value pairs, domains having no more than 10 GB, and databases that are no more than 1 TB. Amazon reports consistency is usually attained across all copies of the data within a second. SimpleDB uses a SQL-like query language.
Project Voldemort, an open source clone of Amazon Dynamo, is a keyvalue data store that supports versioning, eventual consistency (where the database sometimes returns the wrong answer in order to maintain scaling), and automatic partitioning and replication. Keys and values can be complex objects such as maps or lists. Project Voldemort supports offline building of distributed data stores. LinkedIn developers created it, and sites such as Lookery have used it.
Cassandra integrates the Bigtable data model with the Dynamo distributed design. It offers eventual consistency, not the rigid consistency that ecommerce transactions and stock trading require. Instead of data stored in row-major or column-major sequence, Cassandra uses the ColumnFamily order inspired by Bigtable.
Cassandra is geographically distributed across multiple data centers, such as Amazon EC2 availability zones. Bulk loading can be done with Hadoop.
Economics of Scaling
SimpleGeo, a geographic data provider, is using Apache Cassandra, an open source NoSQL offering, to avoid DBMS licensing costs as part of an effort to scale out to a multiserver database architecture.
It's running a 50-node cluster, which spans three data centers on Amazon’s EC2 service for about $10,000 a month, says CTO Joe Stump, who previously used Cassandra at Digg.By contrast,MySQL premium support would cost about $5,000 per year per node, or $250,000 per year -- more than double the Cassandra setup, Stump says, and Microsoft SQL Server can cost as much as $55,000 per processor per year.
The $10,000 is an operational expense as opposed to a capital expense, and that’s "a bit nicer on the books," he says. p>
Cassandra provides availability and scalability for a number of well-known sites, including the huge Twitter and Facebook user communities. When Twitter's user numbers took off, it migrated from MySQL to a combination of MySQL/memcached plus 45 nodes running Cassandra. That mixed environment now handles 50 million tweets per day. Facebook adds about 60 million photos per week using Cassandra. For Digg, Cassandra manages about 3 TB of data.
Digg for one made a highly publicized move from MySQL to Cassandra. Digg's primary motivation for using Cassandra was "the increasing difficulty of building a high-performance, write-intensive application on a data set that is growing quickly, with no end in sight," says John Quinn, Digg's VP of engineering. Growth forced Digg into horizontal and vertical partitioning strategies that eliminated most of the value of a relational database, while still incurring all the overhead, Quinn says.
"Our system grows rapidly and requires us to provide performance and redundancy with multiple data centers and to add capacity or replace failed nodes with no downtime," he adds. As for data consistency, Digg's engineers can implement application-level controls much more efficiently with Cassandra than MySQL does generically, Quinn says.
Tokyo Tyrant is an open source database server, with a companion full-text search engine, that has a following in the NoSQL community. It's a key-value database with a hash and b-tree index structure, capable of inserting 1 million records at 0.4 seconds per record and executing 58,000 queries per second. It supports asynchronous replication and transaction processing with ACID properties and write-ahead logging. There are bindings for multiple programming languages, including Perl, Java, Ruby, and PHP. Production deployments include Scribd and Mixi, the Japanese equivalent of Facebook. LightCloud turns Tokyo Tyrant into a horizontally scalable distributed database with the addition of a universal hashing layer. Social journal Plurk uses the LightCloud Tokyo Tyrant offering.