Channels ▼

Andrew Koenig

Dr. Dobb's Bloggers

Data Structure Audits

November 15, 2012

I've already mentioned the first one: using two separate processors. I imagine that each processor had its own power supply, and probably its own connections in and out of the central office. The idea was that even a complete failure of either processor should not affect the other.

More Insights

White Papers

More >>

Reports

More >>

Webcasts

More >>

The second strategy was that each processor was constantly monitoring the other one. If a processor stopped responding, or started behaving abnormally, the other processor would restart it. This strategy is similar to the firewall notion that I described last week: Even if one processor went haywire, it is very unlikely that the failure could affect the other processor because there was so little communication between the two. Therefore, the odds are overwhelming that if a processor failed, the other one would restart it and life would go on.

I already knew about these first two strategies. What I learned from the switching expert at this conference was that these strategies weren't good enough by themselves to meet the reliability requirements. The reason was that bugs, both hardware and software, would gradually corrupt the processors' internal data structures to the point where that corruption would interfere with the switching system's operation.

This problem lead to the third strategy: All data structures in the system were designed so that they could be audited and repaired. Whenever a processor had nothing else to do, it would go through all of the data structures in its memory, looking for anomalies in each data structure and rebuilding the structure part of that data structure from the corresponding data. He told me that this auditing strategy was so effective that when it was turned off, a processor would crash every day or two; but with auditing turned on, years could elapse between crashes.

The approach of designing data structures to be auditable, and auditing them from time to time, has two advantages. Not only does it simplify recovery after failures (and sometimes makes those failures invisible, or nearly so, if the failing component can be restarted automatically), but data-structure auditing is a powerful debugging tool.

In fact, it's more than that — it's a bug prevention tool. If you have an effective data auditor, you can call it in an assert statement at strategic points throughout your program. If it ever detects a failure, you have found a bug — even though you might never have been able to construct an external test case to trigger that bug. And even though continuously auditing your data structures may make your program unacceptably slow for production, you can always turn on NDEBUG in the version of the program you ship. If someone sends you a test case that causes the program to fail, you can try it again with NDEBUG turned off; perhaps the data-structure auditing will find that bug too.

In short, if you can divide your data structures into a data part and a structure part, and you can write an audit program that is capable of rebuilding the structure from the data, you can use the auditor not only to make your programs more reliable, but to get them working more quickly than you might be able to do otherwise.

Related Reading






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video