Database Tuning: Principles and Surprises


Dennis is an associate professor at NYU's Courant Institute, where he does research on transaction processing, real-time algorithms, and pattern matching. He also consults at UNIX System Laboratories. He can be reached at [email protected].


Database tuning is the activity of making a database system run faster. The tuner may have to change the way applications are constructed, select new indexes, tamper with the operating system, or buy hardware. Understanding how to do this requires a broad knowledge of the interaction among different components of a database management system (DBMS).

This article attempts to lay a principled foundation for tuning and then illustrates some surprising interactions. My approach to tuning comes from a number of sources: my own experience as a member of a team that designed and implemented an embedded DBMS for AT&T Bell Labs, my DBMS consulting experience for Wall Street firms, and, finally, the expertise of tuning consultants affiliated with DBMS vendors such as Oracle, IBM, Sybase, Ingres, Servio, and O2 Technology.

Toward a Rational Approach to Tuning

If you consult the commercial DBMS product manuals for tips on tuning, you will probably get some useful advice, but that advice is presented as disjointed rules of thumb--like so many fortune cookies. This jumble of system-specific facts is hard to manage and remember, in the same way that it's hard to recall the last ten fortune-cookie messages you've read.

My tuning strategy rests on a few common principles that, while they may lack the generative force of mathematical axioms, can put order among the rules of thumb. These principles have the added benefit of explaining interactions among various levels of the system--the connection between index choices, concurrent contention, and buffer management. When reduced to sound bites, the following principles may sound obvious. Even so, they are often ignored. It's therefore worth your while to make a mental checklist and explicitly consider each tuning maxim:

  • Think globally, fix locally.
  • Partitioning cures bottlenecks.
  • Starting is expensive, continuing is cheap.
  • Render unto the server what is due unto the server.

Think Globally, Fix Locally

Say you're presented with a slow system, and you check the running time of each compiled query and discover that one of them is slow. Should you create indexes or take other action to make the query run faster? Well, maybe. First, you should check the accounting statistics to make sure the query runs often enough to be important in the global scheme of things. This seems elementary, but many people waste a lot of time tuning infrequently executed queries and then wonder why their efforts don't bear fruit.

As a second example, suppose you discover that all your disks are saturated. Should you buy a new one to reduce the load? Again, the answer is maybe. Many cheaper alternatives may work just as well. For example, a frequently executed query that performs an equality selection may be scanning instead of searching through an index. Or perhaps a query that performs a SELECT * can be written to select just the attributes it needs and can thereby be answered completely within some dense index. Thus, a local fix to a single query may reduce global expenses.
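As a concrete sketch of such a rewrite (the table and index names here are hypothetical, and index DDL varies by vendor):

  -- Before: fetches entire tuples, so answering the query requires
  -- data-page accesses (or a scan, if the optimizer uses no index).
  SELECT * FROM account WHERE id = 12345;

  -- After: a dense index on (id, balance) covers the rewritten query,
  -- which can now be answered from the index alone, never touching
  -- the data pages.
  CREATE INDEX account_id_bal ON account (id, balance);
  SELECT id, balance FROM account WHERE id = 12345;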

Partitioning Cures Bottlenecks

A bottleneck occurs when too much work is thrown at too few resources. Partitioning means spreading the work among more resources. Surprisingly, this may or may not entail the replication of physical resources.

For example, suppose a bank has many branches whose combined activity overwhelms the resources of a centralized mainframe cluster. Physical partitioning may help in this case, since most account activity occurs at the home branch of each depositor. So, creating a computing resource for each branch and partitioning the accounts so that each depositor's account information resides at his or her home branch's computing site may eliminate the bottleneck.

Now, consider a situation in which a long batch update transaction occurs concurrently with many short online transactions, causing lock and resource contention. In this case, you should ask whether the batch transaction can run at a time when there is little online transaction activity. Such temporal partitioning requires no additional physical resources--only a more even use of existing resources.

Finally, suppose you discover excessive lock contention on the free lists of your database buffer. Free lists are managed as follows: A transaction thread dequeues a page from a free list after acquiring the semaphore on that free list. Each of the several free lists has its own semaphore. Lock contention on the free lists results from lock contention on those semaphores. A good approach is to increase the number of free lists in order to increase the number of semaphores and thereby decrease the amount of lock contention. This form of partitioning is known as logical partitioning, because the only resources replicated are locks, a purely soft resource.

Starting is Expensive, Continuing is Cheap

It takes about as long to read or write an entire disk track as to read or write a small part of one. Similarly, it takes little more time to send one kilobyte across a network than it does to send one byte. The general lesson is that setup costs often dominate running costs.

Consider an application that inherently requires scans of files--for example, it performs range queries on nonclustered attributes. The scan will take a lot less time if each read retrieves a large portion of a disk track rather than a single page. This requires that the file be laid out sequentially on disk and that the prefetching factor be set to fetch eight or more pages. The idea is to amortize the setup costs over several accessed pages.
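To see the amortization concretely, plug in round numbers (illustrative assumptions, not measurements): if positioning the disk arm takes about 20 milliseconds and transferring one page about 2 milliseconds, then eight single-page reads cost roughly 8 x (20 + 2) = 176 milliseconds, while a single prefetched eight-page read costs roughly 20 + (8 x 2) = 36 milliseconds, nearly a fivefold saving.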

Render Unto the Server What is Due Unto the Server

Suppose that an important application must take some action every time data is inserted into a table. One approach is to check this table periodically from the client to see whether new tuples have been inserted since the last time the client looked.

This polling approach has two complementary problems. If the client polls too frequently, it incurs useless overhead by issuing polling queries even when no insertions have taken place. Conversely, if the client polls too seldom, it may miss tuples that were inserted and subsequently deleted before the next polling query.

A better approach is to embed a program in the server that will execute exactly when tuples have been inserted. This trigger-based approach (triggers are analogous to hardware interrupts) will neither incur unnecessary overhead nor miss insertions. Not all DBMS packages offer triggers, but more and more do all the time.
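As a sketch of the trigger-based approach (trigger syntax varies widely across DBMSs; the following uses Sybase/SQL Server-style Transact-SQL, and both table names are hypothetical):

  -- Fires on every INSERT statement against orders: no polling
  -- overhead, no missed insertions. The inserted pseudo-table holds
  -- the newly added tuples.
  CREATE TRIGGER note_insertion
  ON orders
  FOR INSERT
  AS
    INSERT INTO new_order_log (order_id, noted_at)
    SELECT order_id, getdate() FROM inserted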

As a second example, consider a circuit-design application which must traverse the circuit-graph structure many times, to check for power consumption, current noise, testability, and so on. Accessing the necessary pages from the client site across the network to the server site may be hopelessly inefficient. Some object-oriented database management systems give users the option of placing database buffers on the client site, thereby reducing the number of network messages.

The rule of thumb that follows from this principle is that user interaction and compute-intensive tasks should occur at the client, and data-dependent tasks should occur at the server.

Pitfalls for the Unwary

A common approach to teaching database internals is to teach query processing as if the system were single-user and then to teach concurrency control and recovery as a nearly independent topic. This approach works well, since query processing and index selection do not commonly take concurrent activity into account. In fact, it is common for one part of the development team to implement query processing and for a different part to implement concurrency control and recovery. This separation of concerns may be good pedagogy, perhaps even good software engineering, but it is bad tuning policy. The following sections present two extended examples that illustrate the kinds of interactions that can take place at different levels of a system.

The Case of Contending Locks

You have discovered that concurrent inserts into a table encounter severe lock contention. In fact, the insert transactions appear to occur serially, one after another. Your system uses page-level locking, and you have no clustering indexes, but several nonclustering ones.

The diagnosis: The absence of any clustering index implies that all new insertions will go to the last page of the file of tuples. This means that any insert transaction will have to hold a lock on the last page of the file until the transaction ends.

A necessary condition for avoiding this bottleneck is to distribute the inserts among the data pages. Recall that a clustering index based on a B-tree on an attribute (or attributes) X imposes an organization on the data tuples: all tuples with nearby X values are stored near one another.

How can this help? Provided that most concurrent inserts don't have nearby X values, a clustering index will disperse the inserts across the data file, thereby eliminating the lock-contention bottleneck.

But there can still be a problem if a sequential key is being used. A sequential key is one whose value is proportional to the time of insertion--a timestamp, for example. If X is a sequential key, then newly inserted tuples will have the largest X values in the data table, so there will still be a lock-contention bottleneck on the last data page.

One solution is to cluster on a non-sequential key, which eliminates the lock contention on the data pages. If you also have a nonclustering B-tree index on a sequential key, however, there will still be lock contention on the last page of that B-tree whenever there are many concurrent inserts. If the index were a hash structure instead, then nearby but unequal X values would be widely dispersed in both the index and the data table. So, hashing is a good strategy if you must index sequential keys.

Thus, key type, insert frequency, and data-structure type interact in ways that may seem obvious only in retrospect. If many inserts occur on your table, then either form a clustering index on a non-sequential key using a B-tree data structure, or form a clustering index based on a hash structure. If you want a clustering or nonclustering index on a sequential key, then consider a hashing data structure, if one is available.
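In DDL terms, the two safe choices look something like this (illustrative only: the clustered-index spelling is Transact-SQL style, the hash-index spelling is MySQL style, and the names are hypothetical):

  -- Choice 1: cluster on a non-sequential key, such as the account
  -- id, so that concurrent inserts scatter across the data pages.
  CREATE CLUSTERED INDEX account_clust ON account (id);

  -- Choice 2: if you must index a sequential key such as a
  -- timestamp, prefer a hash structure; nearby timestamps then hash
  -- to widely separated pages.
  CREATE INDEX event_ts_hash ON event (ts) USING HASH;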

The Case of Slow Accesses

Our second example has to do with a bank that keeps track of its depositors using the relation:

  account(id, balance, name, street, city, zip)

There are essentially two types of accesses: accesses to the balance attribute, which result from deposits, withdrawals, and queries; and accesses to the entire relation in order to send out monthly statements.

Accesses of the first type may occur up to several times a day and are, on the average, much more frequent than accesses of the second kind. This suggests a vertically partitioned design of the form:

  accountbal(id, balance)
  accountrest(id, name, street, city, zip)

This makes the second style of query more expensive because it requires a join. The first style, however, may become more efficient for two reasons:

  • The original account tuples are much larger than the accountbal tuples, by a factor of 4 to 10. Thus, the chance of finding a random accountbal tuple in the database buffer is at least four times the chance of finding an account tuple there. If both chances are very small, this will not help much; but if 20 percent of the account tuples would have been found in the buffer, then approximately 80 percent of the accountbal tuples will now be in the buffer (and perhaps more, because of the vagaries of the least-recently-used algorithm that manages the buffer).
  • A sparse clustering index is a data structure having one pointer per page, as opposed to one pointer per record. Because accountbal's records are small, the table occupies far fewer pages, so a sparse clustering index on accountbal has far fewer pointers and may be one level shallower than a sparse index on account, saving a disk read on each record access.

Thus, the benefits of vertical partitioning depend critically on the relative access frequencies to the different partitions and on the relative tuple sizes. The benefits increase significantly if your system offers sparse indexes and if your buffer is large.
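A sketch of the partitioned design and the two access styles (column types and exact query forms are illustrative assumptions):

  CREATE TABLE accountbal  (id      INTEGER PRIMARY KEY,
                            balance DECIMAL(12,2));

  CREATE TABLE accountrest (id      INTEGER PRIMARY KEY,
                            name    VARCHAR(60),
                            street  VARCHAR(60),
                            city    VARCHAR(40),
                            zip     CHAR(10));

  -- The frequent case (deposits, withdrawals, balance queries)
  -- touches only the small accountbal tuples.
  SELECT balance FROM accountbal WHERE id = 12345;

  -- Monthly statements now pay for the partitioning: reassembling a
  -- full account record requires a join.
  SELECT r.name, r.street, r.city, r.zip, b.balance
  FROM accountrest r, accountbal b
  WHERE r.id = b.id;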

Troubleshooting Lessons

Database tuning is based on a few principles and a body of knowledge. Some of that knowledge depends on the specifics of particular DBMS packages, such as which index types each system offers. But in actual practice, much tuning is independent of vendor, version number, and even data model (for example, hierarchical, relational, or object oriented).

When troubleshooting a system, the tuner must decide whether to attack global problems first, such as concurrent contention or slow logging, or local ones, such as tuning a single important query. Either way, you will eventually look at lock statistics, disk load, operating-system priorities, relative query frequencies, and query plans. Before deciding on a course of treatment, you must formulate a correct diagnosis. Even without a good bedside manner, you can use the principles here to get a sick database system back on its feet, earning you the ephemeral gratitude of its many users.



Copyright © 1993, Dr. Dobb's Journal

