Dr. Dobb's is part of the Informa Tech Division of Informa PLC

This site is operated by a business or businesses owned by Informa PLC and all copyright resides with them. Informa PLC's registered office is 5 Howick Place, London SW1P 1WG. Registered in England and Wales. Number 8860726.


Channels ▼
RSS

Design

Database Design: How Table Normalization Can Improve Performance


The Cost of Fragmentation

Table 1 shows the cost of fragmentation in size and time. This table shows each design separately, measuring how storage size grows and how performance grows due to fragmentation. These results cannot be used to compare between designs. It only shows how a particular design gets larger and slower due to fragmentation.

The minimal impact of fragmentation occurred with only two updates. The maximum effect was with 10 updates.

Fragmented Size Fragmented Time
minmax minmax
Denorm1.58 2.67 1.111.13
Semi-Norm 1.63 2.13 1.491.56
Norm 1.73 2.23 1.462.04

Table 1: Cost of fragmentation in size and time.

The MESS storage grew by a factor from 1.58 to 2.67, as expected. Each update replaced NULLs or short strings with longer strings, taking up more storage, leading to rows being placed elsewhere in the file structure. Storage was consumed rapidly by updates to the MESS.

The MESS query time, however, did not grow as rapidly as the storage did. This is interesting, and most likely reflects the very small size of the sample data (100 rows). Since the database only occupies a few physical blocks, it can be read quite rapidly in spite of fragmentation. A larger database would have a larger performance penalty.

The partially normalized storage grew by a factor from 1.62 to 2.12. Separating the columns that change from the columns that are static reduces fragmentation, as expected. Query performance, uses joins and unique indexes, grew by 49 percent to 56 percent after the series of updates.

The fully normalized storage grew by a factor from 1.73 to 2.23. The fully normalized version had one row in each table before fragmentation, and a number of rows after fragmentation. Query performance grew between 46 percent and 104 percent after the series updates. This dramatic slow-down us likely due to the change in cardinality from 1:1 to 1:n.

Comparison

Comparison between structures reveals that the partially normalized design has a performance penalty of just 14 percent compared with the MESS design. Without fragmentation, the partially normalized structure may actually return results faster than the denormalized MESS. The fully normalized structure, with a 1:n join has a performance penalty of 68 percent to 131 percent when compared with the original MESS.

Bottom Line

A semi-normalized design does not endure the same level of fragmentation as a denormalized MESS design. Since it uses a 1:1 join instead of a 1:m join, the performance is generally quite good. Further, change can often be isolated to the extension table, offering some protection from devastating change. The rate of fragmentation is the lowest and the performance penalty from a 1:1 join is also quite low.

The management overview is this:

  • Fully normalized is slow, but lower maintenance and easiest to enhance. It requires some analysis to determine dependencies.

  • Fully denormalized can be fast but suffers fragmentation; it is higher maintenance and hardest to enhance. It requires minimal analysis of dependencies or update patterns.

This semi-normalized version, however, requires the most insight to create. It requires understanding the attributes and their usage. Investments made in understanding the application data and processing can pay dividends by reducing administrative busy-work and reducing the risk of problems that are caused by that administrative overhead. Further, understanding the application can lead to optimization of the data and the associated processing.

Steven is a database developer who focuses on data warehousing and the associated e-business architectures. He can be contacted at [email protected]


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.