Testing with Malware Traces
We present the results from running our detection algorithm on traces collected from real botnets. Recall that we detect three different types of anomalies: burst anomalies, triggered by large changes in traffic distribution; persistence anomalies, triggered when destinations are communicated with regularly, even with very little traffic (such as botnet C&C channels); and commonality anomalies, triggered when a number of network users begin to exhibit correlated behavior. These anomalies correspond to the three types of alarms output by our system. Table 2 lists some well-known malware types, indicating what types of alarms are likely to result from each.
Botnet Traces. We collected traffic traces from three distinct botnet families. For each family, we executed the bot code on a host and logged packet traces for a week, reusing the same host over successive weeks to run the three different bots. The host was wiped clean between collections, and a pristine copy of Windows XP was installed. We also turned off the auto-update functionality and configured the firewall to drop all incoming connections. From each trace, we discarded all packets that did not have a source or destination address corresponding to the host. The packet traces were converted to flows by using Bro, and the rest of the analysis uses flows. One of our goals in this section is to understand the detection of the two different behaviors: the attack behavior and the channel behavior (when the malware calls home). In the traces we collected, we saw both. Because many bots in the wild do not generate much volume (and try to remain undetected), detecting the control channel is of critical importance. We briefly describe the three botnets and how the flows were classified:
SDBot. SDBot is a well-studied botnet that uses IRC as the channel but on a non-standard port. However, the IRC servers are easy to pick out from the domain names, for example irc.undernet.org. The traces revealed two distinct atoms in the control flows. The remaining flows consist of scans being run on a neighboring network prefix. We noticed a large number of scans on ports 135, 139, 445, and 2097 (the last associated with a well-known commercial anti-virus product). In the traces, we see connections on the well-known IRC ports and use this knowledge to identify control traffic (the IRC traffic) and attack flows.
Zapchast. This botnet also uses IRC as the channel and uses the well-known IRC ports (6666 and 6667). We saw a total of five IRC service atoms (about 13 distinct IP addresses) in the traces. The attack traffic was predominantly NetBIOS traffic.
Storm. This botnet is P2P-based and very different from the others. Its traces are two orders of magnitude larger than those of the other botnets. Because there is no single destination server or well-defined port, it was quite hard to identify the control channels, and we had to rely on heuristics to do so. Our starting point was the documented fact that Storm uses UDP to connect to its P2P network.
We looked at distributions of the UDP flows (flows with two-way traffic) and noticed a very large number of packets of a small, fixed size (the flows were on non-standard ports and unlikely to be attacks). We took these flows to be an indicator of maintenance traffic and isolated all the ports involved. UDP flows to this set of ports are assumed to be part of the control channel. We did see a much smaller number of HTTP and SSH flows that may also be control related; the volume of these flows is small enough that it does not affect our results. The attack traffic for Storm is overwhelmingly on TCP port 25 (SMTP).
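The Storm heuristic described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure actually used: the flow record layout (`proto`, `dport`, `pkt_sizes`, `bidirectional`), the function names, and the cutoffs `max_pkt_size` and `min_share` are all hypothetical, since the text does not give the packet size or any numeric criteria.

```python
from collections import Counter, defaultdict


def infer_control_ports(flows, max_pkt_size=100, min_share=0.8):
    """Guess which UDP ports carry maintenance (control) traffic.

    flows: iterable of dicts with hypothetical keys
      proto ("udp"/"tcp"), dport (int), pkt_sizes (list of ints),
      bidirectional (bool).
    max_pkt_size and min_share are assumed cutoffs, not values from the text.
    """
    sizes_by_port = defaultdict(Counter)
    for flow in flows:
        # Only two-way UDP flows are considered, as in the description above.
        if flow["proto"] != "udp" or not flow["bidirectional"]:
            continue
        sizes_by_port[flow["dport"]].update(flow["pkt_sizes"])

    control_ports = set()
    for port, sizes in sizes_by_port.items():
        if not sizes:
            continue
        # A port qualifies if one small packet size dominates its traffic.
        size, count = sizes.most_common(1)[0]
        if size <= max_pkt_size and count / sum(sizes.values()) >= min_share:
            control_ports.add(port)
    return control_ports


def label_flow(flow, control_ports):
    """Label a flow as control, attack (SMTP on TCP port 25), or other."""
    if flow["proto"] == "udp" and flow["dport"] in control_ports:
        return "control"
    if flow["proto"] == "tcp" and flow["dport"] == 25:
        return "attack"
    return "other"
```

Flows labeled "other" would include the small number of HTTP and SSH flows mentioned above, which this sketch makes no attempt to classify.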
In the rest of this section, we discuss the detection of persistence anomalies, and we defer the analysis of commonality anomalies due to space limitations.
Detecting stealthy behavior with p-alarms. To validate the detection of the control channel in each of the botnets, we first identify the distinct atoms that can be extracted from the control traffic. For each of these atoms, we compute persistence over the lifetime of the (malware) trace. Recall that we compute this at five different timescales. For the purposes of detection, we flag an atom as a p-alarm if its persistence value at any timescale exceeds the threshold p = 0.6. We found that this threshold is associated with the fewest false alarms per day and the best detection rate, where the rates were averaged over all the destination atoms for all the malware traces.
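A minimal sketch of the p-alarm test follows. It assumes that persistence is the fraction of measurement slots within an observation window in which the atom was contacted at least once, maximized over windows. The window length of 10 slots and the five timescales listed (1, 4, 8, 16, and 24 hours) are assumptions made for illustration; only the 1-hr timescale and the threshold p = 0.6 are stated in this section. The function names are hypothetical.

```python
TIMESCALES_HOURS = (1, 4, 8, 16, 24)  # assumed set of five timescales
WINDOW_SLOTS = 10                     # assumed observation window, in slots
P_THRESHOLD = 0.6                     # threshold given in the text


def persistence(timestamps, slot_seconds, window_slots=WINDOW_SLOTS):
    """Best fraction of active slots over any window of window_slots slots.

    timestamps: connection times to one atom, in seconds from trace start.
    """
    active = {int(t // slot_seconds) for t in timestamps}
    best = 0.0
    # The densest window can always be taken to start at an active slot.
    for start in active:
        hits = sum(1 for s in range(start, start + window_slots) if s in active)
        best = max(best, hits / window_slots)
    return best


def is_p_alarm(timestamps):
    """Flag the atom if persistence exceeds the threshold at any timescale."""
    return any(
        persistence(timestamps, hours * 3600) > P_THRESHOLD
        for hours in TIMESCALES_HOURS
    )
```

Under these assumptions, an atom contacted only once a day is invisible at the 1-hr timescale but is flagged at the 24-hr timescale after seven consecutive days, which is why taking the maximum over several timescales matters for low-volume channels.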
In Figure 3, we plot the maximum persistence value for each of the atoms. The x-axis indicates the persistence value p. The scatter plot uses a distinct marker for each of the three botnets, and each mark plots the persistence value for the corresponding atom. We plot a vertical line at p = 0.6, which is the persistence threshold used by our detection system. Atoms that fall to the right of the vertical line are flagged by our system as possible C&C destinations. This particular threshold, p = 0.6, was selected to achieve the best tradeoff between minimizing the number of false positives (i.e., normal, benign destinations flagged by our method as C&C destinations) and maximizing the detection rate (i.e., the fraction of C&C destinations that we correctly flag).
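The tradeoff behind the threshold choice can be examined with a simple sweep such as the one below. This is a sketch only: the function name and inputs are hypothetical, and because the text does not say how the two quantities were weighed against each other, the code reports both for each candidate threshold rather than picking one.

```python
def sweep_thresholds(cc_persistence, benign_persistence, thresholds):
    """Tabulate detection rate and false positives per candidate threshold.

    cc_persistence: maximum persistence of each known C&C atom.
    benign_persistence: maximum persistence of each benign atom.
    Returns a list of (threshold, detection_rate, false_positive_count).
    """
    rows = []
    for p in thresholds:
        # An atom is flagged when its persistence exceeds the threshold.
        detected = sum(1 for v in cc_persistence if v > p)
        false_pos = sum(1 for v in benign_persistence if v > p)
        rate = detected / len(cc_persistence) if cc_persistence else 0.0
        rows.append((p, rate, false_pos))
    return rows
```

Raising the threshold can only reduce both the false positive count and the detection rate, so the rows make the cost of each candidate value explicit.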
The SDBot traces revealed exactly one atom, and this atom appears toward the top right of the plot. It is the largest marker and is shown as a triangle. The Zapchast traces contained exactly nine atoms, all but one of which appear to the right of the vertical line. Finally, the Storm traces contain approximately 82,000 atoms with persistence levels evenly distributed (for convenience, we only plot a sample of 100 atoms). While persistence is reflected on the x-axis, the vertical bands indicate different timescales. Thus, a point in the bottom band indicates the persistence value is associated with the 1-hr timescale.
We plot the maximum persistence for each destination atom, so the band indicates the timescale at which the persistence value peaked. Looking over the points, we see that the SDBot atom and eight of the nine Zapchast atoms are easily detected, appearing to the right of the threshold. For the single Zapchast atom to the left of the threshold, we noticed exactly two connections, close to each other, over the entire trace. We conclude that these connections do not really count as regular. We point out that these particular botnet instances are stealthy and generate very few connections. One of the atoms (to the right of the line) was associated with 30 connections over a whole week, with at most one connection in a window. Judged by volume alone, this behavior is close to indistinguishable from normal traffic. However, the persistence value for this atom is 0.7, which is above the threshold. This example drives home why a system such as ours is required to detect stealthy malware. With malware becoming more stealthy and with developers building in extraordinary measures to keep it from being detected, looking for volume-based anomalies is unlikely to have much success.
With the rapid evolution of botnets toward increasingly stealthy behavior and the staggering numbers of end-hosts already infected by such malware, there is a dire need to develop and deploy techniques to counteract these problems. In this article, we reviewed the latest in botnet behavior and trends to elucidate the shortcomings of traditional approaches that depend on rule-based and/or volume-based detection. Bots and botnets are able to evade anomaly detection in part because they are polymorphic in nature and thus are considered a new vulnerability with every new sighting; their communication behaviors deliberately mimic those of normal end-hosts, and thus they stay below detector threshold settings.
We therefore analyzed the behavior of real Intel enterprise end-host background traffic and contrasted it with real botnet C&C channel activity. This allowed us to develop and present the Canary end-host detector, designed to root out the botnet command and control channel by tracking the persistence of a node's relationships with destination hosts, and the commonality of persistence across multiple peers, both fairly stable properties of non-botnet traffic. The strength of these methods is that they require no a priori knowledge of the botnets to be detected, nor do they require traffic payload inspection.
 "An Inside Look at Botnets." Paul Barford and Vinod Yegneswaran. In Series: Advances in Information Security, Springer, 2006.
 Symantec. "2H 07 Threat Horizon Report."
 USA Today. "Botnet scams are exploding." March 17, 2008. At http://www.usatoday.com/money
 Damballa. "Damballa announces discovery of Kraken BotArmy," April 7, 2008. At http://www.damballa.com
 F-Secure. "Calculating the Size of the Downadup Outbreak." January 16, 2009. At http://www.f-secure.com
 F. Giroire, J. Chandrashekar, G. Iannaccone, D. Papagiannaki, E. Schooler, and N. Taft. "The cubicle vs. the coffee shop: Behavioral modes in enterprise end-users." In Proceedings Passive and Active Measurement Conference (PAM'08), Springer Verlag Lecture Notes in Computer Science, pages 202-211, Volume 2979, April 2008.
 D. Dash, B. Kveton, J. M. Agosta, E. Schooler, J. Chandrashekar, A. Bachrach, and A. Newman. "When gossip is good: distributed probabilistic inference for detection of slow network intrusions." In Proceedings of the 21st National Conference on Artificial Intelligence, (AAAI'06), pages 1115-1122, July 2006.
 Bro. At http://www.bro-ids.org
The development of the Canary detector was a collaborative research effort with Frederic Giroire, Nina Taft, and Dina Papagiannaki.
This article and more on similar subjects may be found in the Intel Technology Journal, June 2009 Edition, "Advances in Internet Security".