Sharding, Replication, Caches, and In-Memory Databases
Caching, replication, and sharding have proven to be important tools in the modern database architect's toolbox. To meet the requirements of high-throughput websites, system architects are turning to distributed caches and in-memory data grids to reduce I/O operations against disk-based databases.
Developers looking for a cache solution for SQL databases often turn to memcached, an in-memory key-value store. Client libraries are available for a variety of languages, including Java, C++, PHP, Ruby, Erlang, and Python.
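The usual way memcached sits in front of a SQL database is the cache-aside (look-aside) pattern: check the cache, and on a miss read the database and populate the cache. The sketch below shows the pattern in Python; a plain dict-backed class stands in for a real memcached client (such as pymemcache, whose get/set interface is similar), and the "database" is just a dict, so the names here are illustrative rather than a real client API.

```python
class FakeMemcached:
    """In-process stand-in for a memcached client's get/set interface."""
    def __init__(self):
        self._store = {}

    def get(self, key):
        return self._store.get(key)

    def set(self, key, value, expire=300):
        # A real client would honor the expiry; ignored in this sketch.
        self._store[key] = value


def fetch_user(cache, db, user_id):
    key = f"user:{user_id}"
    row = cache.get(key)          # 1. try the cache first
    if row is None:
        row = db[user_id]         # 2. on a miss, read the database
        cache.set(key, row)       # 3. populate the cache for next time
    return row


database = {42: {"name": "Ada"}}  # dict standing in for a SQL table
cache = FakeMemcached()
fetch_user(cache, database, 42)   # miss: reads the database, fills the cache
fetch_user(cache, database, 42)   # hit: served from the cache
```

The payoff is that repeated reads of hot rows never touch the disk-based database, which is exactly the I/O reduction the paragraph above describes.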
Another approach to caching is to stay with the SQL paradigm for both cache and database operations. This is possible by using an in-memory SQL database as a cache for on-disk databases. One such in-memory database, H-Store, was shown to deliver excellent performance when properly partitioned, and was commercialized as VoltDB. Other products that support main-memory database operation include Oracle TimesTen, McObject eXtremeDB, and HyperSQL Database (HSQLDB). Besides acting as caches for large on-disk databases, in-memory databases are often embedded directly in applications, such as low-latency trading systems.
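To make the idea concrete, here is a minimal sketch of an in-memory SQL cache using the standard library's sqlite3 in its `:memory:` mode. This is not how TimesTen or VoltDB are deployed at scale; the point is only that the cache tier speaks the same SQL as the disk-based system of record, so application query code does not change. The table and rows are invented for illustration.

```python
import sqlite3

# The cache is a full SQL database that lives entirely in RAM.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")

# Warm the cache from the (hypothetical) on-disk system of record...
rows = [(1, "Ada"), (2, "Grace")]
mem.executemany("INSERT INTO users VALUES (?, ?)", rows)

# ...then serve reads with ordinary SQL against the in-memory copy.
name = mem.execute(
    "SELECT name FROM users WHERE id = ?", (1,)
).fetchone()[0]
```

Because both tiers use SQL, there is no second data model to maintain, which is the advantage this approach has over key-value caches like memcached.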
Going back to the analogy from my previous post, the flood control system for the Mississippi River often relied on distributaries to relieve the pressure on levees by distributing the flow of water. For very large database (VLDB) architectures, horizontal partitioning provides a similar pressure-relief function: data is distributed so that multiple servers share the query load.
Sharding, or horizontal partitioning of data, is a proven solution for web-scale databases, such as those in use by social networking sites. One such site is Netlog, a social network that has 100 million friendships, 50 million unique visitors per month, and 5 billion page views per month. Netlog runs MySQL, with the typical database getting 3000+ queries per second. The data distribution architecture evolved through several partitioning schemes based on master-slave replication before settling on a sharding scheme based on user ID. The application has been designed to avoid expensive, cross-shard queries.
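A Netlog-style user-ID sharding scheme can be sketched in a few lines: a function of the user ID picks one of N shard databases, and all of a user's rows live on that shard, so single-user queries never cross shards. The shard count and connection strings below are invented for illustration; Netlog's actual topology is not public in this detail.

```python
# Hypothetical shard layout: 4 MySQL servers, one DSN each.
NUM_SHARDS = 4
SHARD_DSNS = [
    f"mysql://shard{i}.example.com/netlog" for i in range(NUM_SHARDS)
]


def shard_for(user_id: int) -> str:
    """Route every query for this user to a single shard.

    Modulo on the user ID keeps all of one user's data together,
    which is what lets the application avoid cross-shard queries.
    """
    return SHARD_DSNS[user_id % NUM_SHARDS]
```

The weakness of plain modulo routing is re-sharding: changing `NUM_SHARDS` remaps almost every user, which is the problem virtual shards (discussed below in the context of Hibernate Shards) are designed to soften.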
Manual methods are often used in the layout of horizontal partitions, but recent research and tools can assist the database designer in laying out shards. Hibernate is an open-source, object-relational mapping (ORM) tool for Java (NHibernate is its .NET counterpart). It is a popular persistence framework for Java developers, and the Hibernate project has taken on the challenge of database partitioning. Hibernate Shards enables developers to work with partitioned data while continuing to use the Hibernate Core API. To meet the re-sharding challenge, it supports virtual shards.
SQLAlchemy is a database toolkit for Python that enables a session to distribute queries across multiple databases. Besides providing object-relational mapping (ORM) support, SQLAlchemy also provides horizontal partitioning through its horizontal sharding extension.
MIT is the home of the RelationalCloud project, which focuses on Database-as-a-Service (DaaS). Cloud computing is characterized by dynamic workloads and a requirement for scalability and elastic computing, which will challenge any rigid data partitioning scheme. At the VLDB 2010 conference, a paper from MIT researchers presented Schism, a scheme for letting workloads drive replication and partitioning. The purpose of Schism partitioning is twofold: to balance partitions and to minimize distributed transactions. Researchers tested Schism with a variety of workloads, including online transaction processing (OLTP) and complex social networks with multiple many-to-many (n-to-n) relationships. The Schism researchers created a graph that represented the database and workload, and then used the METIS algorithm to partition the graph.
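A toy version of Schism's graph-construction step helps make the approach concrete: nodes are tuples, and edge weights count how often two tuples are touched by the same transaction. A real implementation would hand this graph to a partitioner such as METIS; the workload trace below is invented, and this sketch stops at graph construction rather than reproducing the paper's full pipeline.

```python
from itertools import combinations
from collections import Counter

# Each set represents the tuples accessed by one transaction.
workload = [
    {"u1", "u2"},
    {"u1", "u2"},
    {"u3", "u4"},
    {"u2", "u3"},
]

# Edge weight = number of transactions that co-access the pair.
edges = Counter()
for txn in workload:
    for a, b in combinations(sorted(txn), 2):
        edges[(a, b)] += 1
```

A min-cut partitioning of this graph places heavily co-accessed tuples (here, u1 and u2) on the same partition, which is exactly how Schism minimizes distributed transactions while balancing partition sizes.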
But sharding alone does not guarantee that database scalability problems won't cause a site to fall over. An outage at Foursquare.com serves as a reminder to constantly monitor workloads and distribution of data — and to rebalance when necessary.
Click here to read Part 1 of this post.