100 GB to 30,000 GB: Size, Speed, and Benchmarks
The TPC gained its early fame publishing online transaction processing (OLTP) benchmarks, such as TPC-A and TPC-B. TPC benchmarks are defined to model actual business scenarios rather than being suites of tests that isolate communications, disk I/O, backup, and other individual factors affecting database application performance. The TPC defines each benchmark's functions, but organizations are free to choose the mix of hardware and software for their implementation. The TPC also certifies auditors and requires a full disclosure report describing the hardware and software used in a benchmark run.
Approved in 1992, TPC-C replaced the older DebitCredit benchmarks. It is an OLTP benchmark that uses a more complex database and a wider mix of transaction types than its predecessors. TPC-C models an order-processing application for a parts supplier with multiple sales districts and warehouses: the application processes incoming orders and maintains stock levels for 100,000 items at the warehouses, all against a database with ACID properties (atomicity, consistency, isolation, durability).
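To see what "order processing with ACID properties" means in practice, here is a much-simplified sketch of a TPC-C-style new-order transaction using Python's built-in sqlite3 module. The schema, item IDs, and quantities are illustrative inventions, not the actual TPC-C specification; the point is only that the order insert and the stock decrement succeed or fail as one atomic unit.

```python
import sqlite3

# Toy stand-in for a warehouse database; schema is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stock (item_id INTEGER PRIMARY KEY, qty INTEGER)")
conn.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY AUTOINCREMENT,"
             " item_id INTEGER, qty INTEGER)")
conn.execute("INSERT INTO stock VALUES (1, 50)")
conn.commit()

def new_order(item_id, qty):
    """Atomically record an order and decrement stock, or roll back."""
    try:
        with conn:  # sqlite3 commits this block on success, rolls back on error
            cur = conn.execute(
                "UPDATE stock SET qty = qty - ? WHERE item_id = ? AND qty >= ?",
                (qty, item_id, qty))
            if cur.rowcount == 0:
                raise ValueError("insufficient stock")  # undoes the whole unit
            conn.execute("INSERT INTO orders (item_id, qty) VALUES (?, ?)",
                         (item_id, qty))
        return True
    except ValueError:
        return False

print(new_order(1, 30))  # succeeds: stock falls from 50 to 20
print(new_order(1, 30))  # fails: transaction rolled back, stock stays at 20
```

A real TPC-C run adds multiple transaction types, many concurrent terminals, and strict response-time requirements on top of this basic atomicity guarantee.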
TPC-C results show dramatic growth over the past decade in the number of transactions processed per second and a dramatic reduction in the price per transaction.
TPC-E is a more recent OLTP benchmark built around the model of a brokerage firm. Since 2010, vendors have submitted ten TPC-E benchmark results. The TPC provides a software tool named EGen to assist with TPC-E benchmarking; it supplies project and make files and can generate the data the benchmark requires.
A high volume of short-lived queries and a large number of concurrent users typically characterize large-scale OLTP applications. But decision support is a different kind of beast, often involving very large databases, sometimes fed by Big Data.
In recent years the TPC has recognized that an important class of database users runs analytics and business intelligence applications. Those users want optimal performance for decision support. Unlike an OLTP workload, decision support involves longer running queries, often against data warehouses, for a smaller number of users.
The TPC responded to this community by developing benchmarks that model decision-support scenarios instead of transaction processing. The TPC-H benchmark runs a collection of ad hoc queries while concurrently applying updates to the data. The TPC provides a tool, DBGEN, for generating the databases used in TPC-H benchmarks. Published TPC-H results are broken down by database size (100 GB, 300 GB, 1,000 GB, 3,000 GB, 10,000 GB, and 30,000 GB).
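For readers who haven't used it, DBGEN is typically driven from the command line with a scale factor that corresponds roughly to the database size in gigabytes. The invocation below reflects common DBGEN builds and is a sketch only; flag conventions can vary by version, so check your build's help output.

```shell
# Generate the TPC-H tables as .tbl flat files at scale factor 100,
# corresponding roughly to a 100 GB database.
# (Typical DBGEN flag convention; verify against your version.)
./dbgen -s 100
```

The generated flat files are then loaded into the system under test before the query and update streams are run.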
Recently, I received a briefing from two TPC representatives about an upcoming conference and a new decision support benchmark. Interest in benchmarking is a prime motivation for the TPC's conference, the TPC Technology Conference on Performance Evaluation and Benchmarking (TPC-TC), an annual event. Raghunath Nambiar (Cisco) and Meikel Poess (Oracle) are the General Chairs of the 2012 conference, which runs 27-31 August in Istanbul, Turkey. TPC-TC 2012 features a keynote by Dr. Michael Carey (UC Irvine), a well-known database researcher.
The new TPC-DS benchmark is a major improvement over TPC-H, which has been the industry's flagship decision support benchmark for a decade. Meikel Poess provided a drill-down into the details of TPC-DS. Because he worked on both TPC-H and TPC-DS, Meikel was able to highlight useful comparisons between the two.
We'll take a closer look at TPC-DS in my next blog post.