Red-Team Application Security Testing

Nov03: Red-Team Application Security Testing

Testing techniques designed to expose security bugs

Herbert is director of security technology and Scott is director of testing technology at Security Innovation LLC.

Feature Scoring
Testing Techniques and Tools

Testing for software security usually means simulating attacks through the network against entire systems. This is evident from the number of penetration-testing tools that have appeared, including SATAN, SAINT, and Retina, among others. However, one of the biggest problems in network security is that intruders might exploit a buffer overflow in an application that is accessible through the network. An industry has emerged to fix these problems at the network level using hardware devices such as firewalls and intrusion-detection systems. But the truth is that, if the underlying software that runs on target systems were more secure, the need for these types of patching measures would be reduced or eliminated.

Using firewalls and testing at the network layer is not the answer. One problem is that network-penetration testing turns security testers into librarians who expose well-known, reemergent vulnerabilities with no hope of finding new ones. We have often seen so-called "penetration tests" that amount to a few hundred automated scripts representing known exploits. This paradigm has become the standard not just for network security testing (where it is arguably more effective), but for application security testing as well. To thoroughly test applications for security, though, you need to test like detectives, not librarians. In this article, we describe a methodology for finding the underlying causes of these vulnerabilities: bugs in software. This method helps organize application-penetration testing through decomposition of an application, ranking of features for potential vulnerabilities, and allocation of resources.

The Security Testing Problem

Why do you need application security testing? Isn't it covered by functional testing, specification-based testing, regression testing, and all the other types of standard verification procedures that software-development organizations use? Unfortunately, the answer is a resounding "no!" We realized this several years ago when security bugs came into the limelight. We found that the underlying flaws in software that let attackers exploit applications or networks were rarely flaws that violated some requirement or rule in the specification. Instead, the flaw turned out to be some side effect of normal application behavior.

Consider, for instance, a notorious bug in the Pine 4.3 e-mail client for Linux. Under certain configurations, Pine 4.3 writes messages being edited through its user interface to a temporary file in the /tmp directory, which is globally accessible. This means that attackers could read any message from any user on the system while it was being composed. Is this a bug? Well, certainly it's a security issue of the highest severity, but it doesn't fit the model of a traditional functional bug. Mail could be successfully composed and sent, and such test cases were likely executed thousands of times. The side effect of writing out to temporary, unprotected storage just wasn't noticed by testers and developers.

For more information on the side-effect nature of security vulnerabilities, see "Testing for Software Security" (DDJ, November 2002) or How to Break Software Security: Effective Techniques for Security Testing, by James Whittaker and Herbert H. Thompson (Addison-Wesley, 2003). The hidden nature of most security bugs is the reason applications need specific, focused security testing. This is testing that defies the traditional model of verifying the specification and, instead, hunts down unspecified, insecure side effects of "correct" application functionality.
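To make the temporary-file side effect concrete, here is a minimal sketch in C for a POSIX system. It is not code from Pine or from any tool described in this article, and both function names are hypothetical. The first function is the kind of check a tester could run against files an application leaves behind; the second shows one way a developer could create a temporary file that other users cannot read.

```c
#include <stdlib.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Test-side check: returns 1 if `path` can be read by anyone other than
 * its owner, 0 if it cannot, and -1 if the file cannot be examined.
 * Only the file's own mode bits are inspected. */
int readable_by_others(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;
    return (st.st_mode & (S_IRGRP | S_IROTH)) != 0;
}

/* Developer-side alternative: create a uniquely named temporary file that
 * only the owner can read or write. `tmpl` must end in "XXXXXX" and is
 * overwritten with the name chosen. Returns an open descriptor or -1. */
int make_private_temp(char *tmpl)
{
    int fd = mkstemp(tmpl);

    if (fd >= 0 && fchmod(fd, S_IRUSR | S_IWUSR) != 0) {
        close(fd);
        unlink(tmpl);
        return -1;
    }
    return fd;
}
```

A tester might call readable_by_others() on every file that appears in /tmp while a message is being composed. The check ignores directory permissions and access-control lists, so a result of 0 does not by itself prove that the file is private.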

"Red-teaming," "penetration testing," and "security testing" are all terms that express the same basic idea: short, focused, intense security testing of applications. This testing is independent of the development group and usually falls outside of normal application-testing channels; that independence is the point. Red-teaming lets testers attack an application in ways an intruder is likely to. But this alone isn't effective enough. An application opens itself up to potentially thousands of man-hours' worth of attacker effort once it is released. Security testers must work more efficiently and with greater accuracy than intruders do in order to have any hope of catching the majority of security defects in an application. This is what red-teaming is all about and why the need for it is so acute.

The Methodology

One of the key needs in creating a short, focused security assessment of applications is to quickly identify which areas of the software are most likely to be vulnerable. For this, we decompose an application into features and score these features for insecurity. During this process, we show how to identify the testing strategies and attacks that are likely to be bug-revealing for that feature. From this information, we develop a plan and assign people to roles: people to investigate components, people to execute tests, and people to develop or acquire tools. This feature-based testing lets us draw conclusions about component strengths and weaknesses in very specific terms that give developers the information they need to fix the problems. This model (see Figure 1) has been used to successfully conduct penetration tests for large software companies and find vulnerabilities that have stopped shipment on many sizeable commercial products.

Decomposition of the application means partitioning the application's features into manageable testing areas. The method of partition can vary, but ideally it is guided by two questions:

  • Is this feature of manageable size for a single individual, operating alone or with a small team, to explore its functionality and conduct tests in a relatively short time?
  • Does this feature form a natural partition in that most functionality is contained within this feature and there are few interfaces between it and the rest of the application?

Imagine, for example, a music player that plays both streaming media from the Web and files stored either locally or on remote machines. One simple partition of the application may be:

  • Reading of files from the local filesystem.
  • Communication through the network with streaming media.
  • The GUI.
  • Storing of favorites and other user-preference data.

There could be other possible divisions of the application. For large applications, there are likely to be dozens of features, so you must decide how to allocate testing resources among them. You could base the allocation on the number of inputs, the proportion of users likely to use a feature, or lines of code. For functional testing, these are certainly reasonable criteria, since the goal is coverage and the concern is the likelihood that users encounter a bug (that is, use that feature). For security testing, the reasoning is different. Since the focus is short, intense testing, you should allocate more resources to the components that are more likely to contain vulnerabilities. See the accompanying text box entitled "Feature Scoring" for more information.
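One simple way to turn scores into a resource plan is to split the available tester hours in proportion to each feature's score. The sketch below assumes that approach; it is not the scoring scheme from the "Feature Scoring" text box, and the struct, the function name, and the rounding rule are all hypothetical.

```c
#include <stddef.h>

struct feature {
    const char *name;
    int score;   /* hypothetical insecurity score; higher means riskier */
    int hours;   /* filled in by allocate_hours() */
};

/* Split total_hours across n features in proportion to their scores.
 * Hours lost to integer rounding go to the highest-scoring feature.
 * Scores are assumed to be non-negative; if they sum to zero, nothing
 * is assigned. */
void allocate_hours(struct feature *f, size_t n, int total_hours)
{
    long sum = 0;
    int assigned = 0;
    size_t i, top = 0;

    for (i = 0; i < n; i++) {
        sum += f[i].score;
        if (f[i].score > f[top].score)
            top = i;
    }
    if (n == 0 || sum <= 0)
        return;
    for (i = 0; i < n; i++) {
        f[i].hours = (int)((long)total_hours * f[i].score / sum);
        assigned += f[i].hours;
    }
    f[top].hours += total_hours - assigned;
}
```

For the music player, if the four features were given made-up scores of 2 (local files), 5 (network streaming), 1 (GUI), and 2 (preferences), a 40-hour budget would be split 8, 20, 4, and 8. A proportional split is only one option; a team could just as reasonably test features strictly in descending score order until time runs out.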

Once features are scored, they are assigned to testers who manage the evaluation of the components. Testers have two primary responsibilities at the outset of component testing:

  • Determine what tools are necessary or would be helpful in executing tests. Requirements for these tools are then passed on to developers within the test organization, who search for a low-cost or free tool that meets them; if no such tool can be located, they develop one. Bear in mind these are small, focused, special-purpose tools likely to have a short development time.
  • Identify testing techniques that would be useful in exposing vulnerabilities in the component (see the text box entitled "Testing Techniques and Tools").

As more is understood about the component during test execution, the requirements for testing tools may change. For this reason, test developers and testers work hand-in-hand to produce new tools as needed. When vulnerabilities are found, problem reports are sent to the stakeholders in the testing effort (the internal product development group). Each report includes reproduction steps, hardware configuration, operating-system details, the tools needed to reproduce the failure, and any other relevant information. Once the project is over, these reports form the basis for postmortem bug evaluations.
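The fields of such a problem report map naturally onto a small record. The following sketch shows one possible layout in C; the struct, its field names, and format_report() are hypothetical and are not taken from any reporting tool mentioned in this article.

```c
#include <stdio.h>

/* One record per vulnerability found. The fields mirror the items listed
 * above; "feature" and "summary" are additions for illustration. */
struct problem_report {
    const char *feature;      /* component under test */
    const char *summary;      /* one-line description of the failure */
    const char *repro_steps;  /* how to reproduce it */
    const char *hardware;     /* hardware configuration */
    const char *os_details;   /* operating-system details */
    const char *tools;        /* tools needed to reproduce the failure */
};

/* Write a plain-text version of the report into buf. Returns the number
 * of characters the full report needs, as snprintf() does, so a result
 * of size or more means the output was truncated. */
int format_report(const struct problem_report *r, char *buf, size_t size)
{
    return snprintf(buf, size,
                    "Feature:   %s\n"
                    "Summary:   %s\n"
                    "Steps:     %s\n"
                    "Hardware:  %s\n"
                    "OS:        %s\n"
                    "Tools:     %s\n",
                    r->feature, r->summary, r->repro_steps,
                    r->hardware, r->os_details, r->tools);
}
```

Keeping the reports in a uniform structure like this also makes it easier to group them by feature when the postmortem begins.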


Bugs are corporate assets. There is no better way to understand what your organization is doing wrong than to thoughtfully analyze bugs that escaped the normal development and testing processes through a postmortem evaluation. This analysis helps refine the testing process so that those types of bugs are found sooner in future security-testing endeavors. Postmortems are best done soon after the security-testing project has ended, when bugs are still fresh in the minds of the testers who found them.


Ideally, development and testing practices in the industry will move to accommodate the need for security-aware measures. Until then, red-teaming is perhaps the best practice to use.


Thanks to Matthew Oertle of Security Innovation for providing code excerpts of our in-house network-corruption tool.


Listing One

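/* Network Corruption excerpt
   By Matthew Oertle
   This is the callback function for libpcap.
   u_char *data is a pointer to the incoming packet.
   EthHdr, IpHdr, and TcpHdr are the tool's own pointer types for packet
   headers. Their definitions, along with ETH_H, TEST_PORT, externalMAC,
   iface, and device, are not part of this excerpt. */
void Callback( u_char *user, const struct pcap_pkthdr *header,
                                                     const u_char *data ) {
    // Structures for packet fields
    EthHdr ethOut;
    IpHdr  ipOut;
    TcpHdr tcpOut;
    // Assumption: the packet length is the captured length from libpcap
    int data_len = header->caplen;
    int offset = 0;
    // libpcap passes a const buffer; the tool edits the packet in place
    u_char *pkt = (u_char *)data;

    ethOut = (EthHdr)pkt;
    offset += ETH_H;

    // Take care of Layer 2 addressing
    memcpy(ethOut->src_mac, externalMAC, 6);

    // Look at IP packets
    // (the comparison assumes the field is already in host byte order)
    if(ethOut->protocol == 0x0800) {
        ipOut = (IpHdr)(pkt + offset);
        offset += ipOut->hlen * 4;

        // Look at TCP packets
        if(ipOut->protocol == 0x06) {
            tcpOut = (TcpHdr)(pkt + offset);
            offset += tcpOut->hlen * 4;
            // Check if it is the port we are interested in
            // (same byte-order assumption as above)
            if(tcpOut->dest_port == TEST_PORT) {
                // Call the corruption function
                corrupt_payload(pkt + offset, data_len - offset);
                // Re-compute the checksum
                // (the call is not included in this excerpt)
            }
        }
    }
    // Inject the modified packet onto the wire
    libnet_write_link_layer(iface, device, pkt, data_len);
}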
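Listing One notes that the checksum must be re-computed after the payload is corrupted; otherwise the receiving TCP stack discards the packet before the application under test ever sees it. The excerpt does not show how the tool does this. As an illustration only, the following is a minimal sketch of the standard Internet checksum described in RFC 1071, which is the algorithm IP and TCP use. The function name is hypothetical. For TCP, the checksum field must be zeroed first and the sum must also cover a pseudo-header built from the IP addresses, protocol number, and segment length, which this sketch leaves to the caller.

```c
#include <stddef.h>
#include <stdint.h>

/* Internet checksum (RFC 1071) over len bytes: the one's complement of
 * the one's-complement sum of the data taken as big-endian 16-bit words.
 * An odd trailing byte is padded with a zero byte. The 32-bit
 * accumulator is large enough for any buffer up to 64 KB. */
uint16_t inet_checksum(const uint8_t *buf, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)buf[i] << 8) | buf[i + 1];
    if (len & 1)
        sum += (uint32_t)buf[len - 1] << 8;

    /* Fold the carries back into the low 16 bits */
    while (sum >> 16)
        sum = (sum & 0xffff) + (sum >> 16);

    return (uint16_t)~sum;
}
```

The result is in host order and must be stored in the header high byte first. A useful property for testers: running the same function over data that already contains a correct checksum yields 0, which gives a quick way to confirm that a modified packet will be accepted.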

Listing Two

/* This function takes a pointer to the packet data and the length. hi and lo
   are global variables that are initialized to 0xff and 0x00. The function
   corrupts a single byte each time the match string is found in the packet.
   memmem() is a nonstandard extension (GNU and BSD) declared in <string.h>. */
int corrupt_payload(u_char *data, int len) {
    if(memmem(data, len, match, match_len)) {
        data[lo] = hi;
        if(hi == 0xff) {
            /* ... the rest of this branch is not included in the excerpt ... */
        }
    }
    return len;
}
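Because Listing Two depends on globals and on memmem(), it cannot be compiled on its own. The sketch below is a self-contained variation on the same idea, written for illustration rather than taken from the tool: it passes the match string, the offset, and the replacement byte as parameters, uses a portable search in place of memmem(), and refuses to write past the end of the payload. All names in it are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Portable stand-in for memmem(): returns a pointer to the first
 * occurrence of needle in hay, or NULL if there is none. */
static const unsigned char *find_bytes(const unsigned char *hay,
                                       size_t hay_len,
                                       const unsigned char *needle,
                                       size_t needle_len)
{
    size_t i;

    if (needle_len == 0 || needle_len > hay_len)
        return NULL;
    for (i = 0; i + needle_len <= hay_len; i++)
        if (memcmp(hay + i, needle, needle_len) == 0)
            return hay + i;
    return NULL;
}

/* If the payload contains `match`, overwrite the byte at `offset` with
 * `value`. Returns 1 if a byte was corrupted and 0 if the payload was
 * left alone (no match, or offset outside the payload). */
int corrupt_if_match(unsigned char *data, size_t len,
                     const unsigned char *match, size_t match_len,
                     size_t offset, unsigned char value)
{
    if (offset >= len || find_bytes(data, len, match, match_len) == NULL)
        return 0;
    data[offset] = value;
    return 1;
}
```

A caller could step `offset` through the payload on successive packets and alternate `value` between boundary bytes such as 0x00 and 0xff, so that each matching packet exercises a different part of the receiving application's parser.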
