Dan is an independent security researcher and can be contacted at email@example.com. Wietse is a researcher at IBM's T.J. Watson Research Center. He can be contacted at firstname.lastname@example.org.
On September 25, at 00:44:49 in the U.S. Central time zone, a nastygram was sent to a machine of a friend of ours. The target was the rpc.statd service, which is part of the NFS (Network File System) file-sharing protocol family. The intruder gained access to the system within seconds and came back later that same day. Figure 1 shows what was logged as a result of all this activity.
Upon closer investigation, this was a rather routine break-in. The exploit involved a vulnerability in the rpc.statd service that ships with RedHat 6.2 Linux and with other UNIX systems. The intruder overflowed a stack-based buffer with a bogus hostname, and thereby took full control over the rpc.statd process. This, in turn, gave the intruder full control over the entire system because the rpc.statd process runs with superuser privileges.
Postmortem examination of file-access time patterns revealed more detail. At 00:45:15, the intruder installed a backdoor login program; it was compiled on the spot from a few lines of source code. At 00:45:16, the intruder added an entry to a configuration file to enable logins via the telnet service; this service was already enabled at the time, causing a warning to be logged about a duplicate service. At 00:45:28, a telnet connection was made to verify that the backdoor was functional.
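File-access time patterns of this kind (so-called MAC times) can be collected with nothing more than the standard stat() interface. The Python below is a hedged sketch of the idea, not a forensic-grade tool: merely reading files during an investigation disturbs the very access times under study, so real examinations work from a read-only image.

```python
import os
import time

def mac_timeline(root):
    """Build a crude timeline of (timestamp, kind, path) events from the
    modification, access, and inode-change times of every file under a
    directory tree. Sorting the events chronologically reconstructs a
    rough picture of activity, as in the postmortem described above."""
    events = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.stat(path)
            except OSError:
                continue  # file vanished or is unreadable; skip it
            events.append((st.st_mtime, "modified", path))
            events.append((st.st_atime, "accessed", path))
            events.append((st.st_ctime, "inode-changed", path))
    events.sort()
    return events

if __name__ == "__main__":
    # Print the ten most recent events under the current directory.
    for ts, kind, path in mac_timeline(".")[-10:]:
        stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(ts))
        print(stamp, kind, path)
```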
This was a quick and automated job. The whole break-in, from first contact to backdoor test, was completed in less than a minute. Later that day, the intruder returned to install some software for distributed denial of service (DDoS) attacks.
This article is not about intrusion detection. Instead, it is about being prepared for intrusion as something that will eventually happen. It is about building safety nets. No matter how strong your defense mechanisms are, they will eventually fail. By being prepared for the unavoidable security breach, you can avoid losing control and being taken by surprise. By being prepared, you have an opportunity to control the amount of damage.
This requires nothing but defensive programming techniques applied to systems and networks. As every good programmer knows, software will eventually fail, no matter how well it is written. The same applies to systems, networks, and security mechanisms. They will eventually fail. A system that is prepared for failure has safety nets in various places. Being prepared is all about making recovery from failure less painful.
Information Is Power
Being prepared is the name of the game. In this article, we discuss how you can instrument systems and networks so they can generate the information that you need to find out whether security was breached, and what the intruder might have done.
Using the break-in as the main example, we will illustrate what a successful intrusion can look like, depending on the level of instrumentation that you are willing to invest in. We'll call these levels basic, advanced, and extreme; each generates an increasing amount of information.
We will introduce the main idea with measures that can be applied to individual systems, and discuss their network-based equivalents in a later section.
The main problem with system-based logging is that the information is hard to protect against tampering. With a critical system, you would arrange for off-host logging to a dedicated log host or even to hardcopy media. An old machine with no network logins would be an adequate log host in many cases. Hardcopy logging would be usable only for low-event rates and for small data volumes.
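As a concrete sketch of off-host logging, the Python below forwards log records to a dedicated log host over the standard syslog UDP port. The host name is a placeholder for your own log server; the point is simply that records leave the monitored machine as they are generated, so an intruder who gains root cannot quietly rewrite the history.

```python
import logging
import logging.handlers

def make_remote_logger(loghost, port=514):
    """Return a logger that copies security-relevant records to a
    dedicated log host via the syslog protocol over UDP. 'loghost'
    is a placeholder; point it at your own (hardened) log server."""
    logger = logging.getLogger("offhost")
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(loghost, port))
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s: %(message)s"))
    logger.addHandler(handler)
    return logger
```

A service would then call, for instance, `make_remote_logger("loghost.example.com").info("connect from %s", peer)`; because UDP syslog is fire-and-forget, a serious deployment would also keep a local copy.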
Basic System Instrumentation
Basic system instrumentation uses the standard accounting and logging mechanisms that come with the system. These maintain records for logins, changes of privileges, and whatever application-level logging is available from e-mail software, web servers, and so on.
With basic instrumentation, only the error logging from Figure 1 would be recorded: the scream of agony from the rpc.statd server, and the complaint from the inetd service about a duplicate telnet service entry. There would be no record of the three telnet connection events in Figure 1. And of course, the backdoor login program made no record of the corresponding three intruder logins.
Advanced System Instrumentation
Advanced system instrumentation cranks up the noise level by including process accounting that shows all commands executed on the system, by including logging of inbound connections to network services such as telnet, and by lowering thresholds for logging in general. Some systems, such as RedHat 6.2, have connection logging on by default, while process accounting is an optional software package that must be installed by hand. With other systems, it is the other way around.
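The kind of inbound-connection logging described here can be sketched in a few lines of Python. The one-line echo service below is a stand-in for a real daemon, and the "connect from" message imitates the style of tcpd's log lines; it is an illustration of the principle, not a replacement for the system's own facilities.

```python
import logging
import socketserver

logging.basicConfig(format="%(asctime)s netlog: %(message)s",
                    level=logging.INFO)

class LoggingEchoHandler(socketserver.StreamRequestHandler):
    """Toy network service that records each inbound connection in the
    style of tcpd's 'connect from host' lines. The echo-one-line
    behavior stands in for whatever the real service would do."""
    def handle(self):
        logging.info("connect from %s", self.client_address[0])
        self.wfile.write(self.rfile.readline())  # echo and close

def make_server(port=0):
    """Bind the logging handler to a local TCP port (0 picks a free one)."""
    return socketserver.TCPServer(("127.0.0.1", port), LoggingEchoHandler)
```

Running `make_server().serve_forever()` then leaves a log record for every connection, whether or not the client ever sends data.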
With advanced instrumentation, our rpc.statd intrusion generates all the logging that is in Figure 1, plus process accounting records that show the commands that the intruder executed. Unfortunately, process accounting was not enabled at the time of the actual break-in. Figure 2 shows a reconstruction, based on other data that we were able to collect.
Extreme System Instrumentation
Extreme system instrumentation goes beyond mere logging. It involves changing the system by installing bugs and traps. To prepare for intrusion, people instrument system software so that it can record every keystroke of a session; with the increasing use of encrypted network connections, system-based session recording is the only option left for keeping a grip on what is going on. In addition to keystroke logging, people install booby-trapped versions of standard utilities that set off alarms when an intruder is rummaging around the system. These alarms are a safety net in case someone manages to slip past all other defenses.
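To illustrate the booby-trap idea, the sketch below logs every invocation of a wrapped utility to syslog before handing control to the real program. The path of the renamed genuine binary is hypothetical, and a real trap would of course send its alarms off-host; this only shows the mechanism.

```python
import os
import sys
import syslog

# Hypothetical location of the renamed genuine binary; a real trap
# would be installed in place of the original utility.
REAL = "/usr/bin/ls.real"

def trap_and_exec():
    """Record who ran this command and with what arguments, then hand
    off to the real program. Installed in place of a standard utility,
    this acts as a crude tripwire: unexpected invocations show up in
    the (ideally off-host) syslog stream."""
    syslog.openlog("trap", syslog.LOG_PID, syslog.LOG_AUTH)
    syslog.syslog(syslog.LOG_WARNING,
                  "uid=%d ran %s" % (os.getuid(), " ".join(sys.argv)))
    os.execv(REAL, sys.argv)  # never returns on success
```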
Figure 3 illustrates extreme instrumentation with a piece of historical data. Once upon a time, a very persistent intruder spent days and days banging on the door of the Bell Laboratories Research firewall, using systems at Stanford University as a launching base. Figure 3 shows what the intruder saw when he first got in. He did not know that Bill Cheswick and colleagues were watching him the whole time. And that was not all: Stephen Hansen and colleagues at Stanford University had instrumented their system software to monitor the intruder keystroke by keystroke. Meanwhile, Tsutomu Shimomura was doing the exact same thing at the network level. This amusing episode of an intruder being fooled was described by Bill Cheswick in "An Evening with Berferd" (available at ftp://ftp.research.bell-labs.com/dist/ches/berferd.ps). The story is also featured in Firewalls and Internet Security (Addison-Wesley, 1994, ISBN 0201633574), the classic textbook Cheswick coauthored with Steve Bellovin.
If advanced system instrumentation is like having still pictures of the scene, then extreme system instrumentation is like installing video cameras and burglar alarms all over the place.
Clearly, all this logging and monitoring comes at a price. Generating all the extra information not only eats up system resources, it can seriously affect the privacy of legitimate users. If you are monitoring systems at this level of detail, be prepared to spend a fair amount of time investigating failed attempts to break in. In practice, extreme monitoring as shown in Figure 3 will be feasible only under special conditions.
Silence of the LANs
Preparation for trouble at the network level isn't just important; it's mandatory. Networks are transport elements, not storage elements: all data must be captured and stored in real time or it will be lost forever.
While network recordings are undeniably useful, many people think that virtually all aspects of an intrusion or event could be monitored by examining the network packets that flow between the intruder and the target systems: who needs to do anything on the host if you can watch everything that the intruder does? Unfortunately, for a wide variety of reasons, this is not the case.
The first problem is that you only get to see the actions, not necessarily the consequences of the actions. It's like watching someone getting shot in a video of a robbery: Were they hurt badly? Or perhaps they were wearing a bullet-proof vest. The true consequences of the actions may be found only by examining the victim.
Another problem is the sheer volume of traffic, which can preclude the processing or storing of even connection information and metadata about network traffic, let alone the content. Modest systems or small organizations can easily generate gigabytes of network traffic in a day.
In addition, even if you could capture all the traffic that flows through a network, activity could still be undetectable or undecipherable due to encryption; covert channels; connectionless traffic; use of back doors hidden in legitimate protocol traffic (HTTP, SMTP, and so on); incorrect, broken, or fragmented network packets; and a host of other issues.
That said, a network that is fully prepared for combat can be a powerful weapon. While at times it might feel as though a network has only two aqueous modes, bone dry or fire-hose wet, there are, as with the host-level controls, finer levels of control that can be quite useful. We'll start by examining an example of mild instrumentation and work up to more complex methods.
Being Prepared on the Network
While we've talked about gathering some network information from the host level, static snapshots of the network state can be even more easily gathered by talking to the devices that make up the network: routers, switches, and other network-access and traffic-flow gear.
While some data can be gathered by logging into the device in question, most devices lack the operating-system or data-storage facilities to store the results on the device itself. So, in practice, data is retrieved from the device by small programs or by a query protocol such as SNMP. However, like general-purpose hosts, most network devices have logging facilities that can be turned on; this is where our basic instrumentation level begins.
For example, a popular type of network logging is Cisco's NetFlow, which can be turned on in many of their routers and switches. A network flow is a data construct that tracks (in one direction only) a flow of traffic, keeping a wide range of metadata (not content), including the source and destination network addresses, start and end times, packet types, port numbers, and the like.
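The flow abstraction itself is easy to sketch. The Python below (with illustrative field names, not Cisco's actual export format) aggregates per-packet metadata into unidirectional flow records keyed by the usual 5-tuple:

```python
from dataclasses import dataclass

@dataclass
class Flow:
    """A NetFlow-like record: one direction only, metadata only."""
    src: str        # source address
    dst: str        # destination address
    proto: str      # protocol, e.g. "tcp" or "udp"
    sport: int      # source port
    dport: int      # destination port
    first: float    # timestamp of first packet seen
    last: float     # timestamp of most recent packet seen
    packets: int = 0
    octets: int = 0

class FlowTable:
    """Aggregate individual packets into flows keyed by the 5-tuple."""
    def __init__(self):
        self.flows = {}

    def add_packet(self, src, dst, proto, sport, dport, ts, length):
        key = (src, dst, proto, sport, dport)
        flow = self.flows.get(key)
        if flow is None:
            flow = self.flows[key] = Flow(src, dst, proto, sport, dport,
                                          first=ts, last=ts)
        flow.last = max(flow.last, ts)
        flow.packets += 1
        flow.octets += length
```

Note that, true to the one-direction rule, the reply traffic of a TCP session shows up as a second, separate flow.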
The rpc.statd attack in Figure 1 would be seen in the NetFlow logs as something like the set of TCP packets in Figure 4 (heavily processed). General accounting and logging or packet monitoring can ascertain the type, amount of traffic, and other network metadata in time intervals or by connection (if applicable), as well as recognize interactive (user) network sessions. Information that is captured on the network device can either be pushed to a logging host or pulled via SNMP queries.
Care must be taken not to overly trust such gross measurements, useful as they can be. How do we know that these packets were actually statd related? People have, for instance, implemented faux DNS name servers and clients that tunnel IP over DNS, using fully compliant DNS queries to pass along the traffic. This can create a situation akin to a radio transponder being placed on a penguin to track its migration; when a shark eats the bird, you might start tracking Jaws rather than an oil-spill victim. Caveat networkus.
Going to a more advanced stage of instrumentation involves recording the contents of network packets. This is done with network sniffing software, either on a dedicated machine (ideally, with the network transmit wire physically severed) or on a general-purpose system. While it can require specialized software to capture keystroke logs on the host system, it's actually fairly easy to record the content of sessions at the network level, although encryption and other methods of obscuring data can still foil your efforts.
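To get a feel for what such content recordings look like on disk, here is a minimal Python sketch that walks a libpcap capture file, such as one written with `tcpdump -w`. It assumes the common little-endian, microsecond-resolution format; real-world captures can also be big-endian or nanosecond-resolution, which this deliberately simple reader rejects rather than handles.

```python
import struct

# libpcap file layout: one global header, then per-packet records.
PCAP_GLOBAL = struct.Struct("<IHHiIII")  # magic, major, minor, tz,
                                         # sigfigs, snaplen, linktype
PCAP_RECORD = struct.Struct("<IIII")     # ts_sec, ts_usec,
                                         # captured len, original len

def read_pcap(path):
    """Yield (timestamp, raw_bytes) for each packet in a little-endian
    microsecond libpcap capture file. A sketch only: other variants
    of the format are rejected outright."""
    with open(path, "rb") as fh:
        header = fh.read(PCAP_GLOBAL.size)
        magic = PCAP_GLOBAL.unpack(header)[0]
        if magic != 0xA1B2C3D4:
            raise ValueError("not a little-endian microsecond pcap file")
        while True:
            rec = fh.read(PCAP_RECORD.size)
            if len(rec) < PCAP_RECORD.size:
                break  # clean end of file
            ts_sec, ts_usec, incl, _orig = PCAP_RECORD.unpack(rec)
            yield ts_sec + ts_usec / 1e6, fh.read(incl)
```

A tool like Snort does far more, of course (decoding, reassembly, pattern matching), but everything it does starts from records shaped like these.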
Network sniffers can provide detailed information about network traffic, but they must be judiciously placed, maintained, and protected. By logging all packets and content, you at least have a fighting chance of getting most of the data that you need to solve any mysteries that are uncovered. This level of detail is practical only if you have small amounts of network traffic or if you have lots of resources to store and process such data.
As Figure 5 shows, examining the rpc.statd attack with Snort, a popular packet sniffer, reveals great detail in a tightly constructed UDP packet of 1084 bytes. This packet is an attempt to coerce rpc.statd into an act of malfeasance by overflowing a buffer via a call to the vsnprintf library routine. We can actually see the exploit code, which puts the name of the Bourne shell on the stack in the correct location for execution, along with the arguments to vsnprintf. This sequence is what we saw earlier in both the erroneous "gethostbyname" log entry (Figure 1) and the Cisco NetFlow logs (Figure 4).
However, did the attack succeed? It may be impossible to determine with solely the information gathered from the network. This is where it is crucial to examine the host data as well. In the case of the rpc.statd attack, it's rather obvious that the attack succeeded because the next thing we see is a TCP connection from attacker to victim to install a login backdoor.
Most sites don't keep all the raw network data; if the network is being monitored at all, filters and other processors are used in an attempt to keep only pertinent information. While throwing away any information is against our religion, there are times when it can be of great utility to do so. Indeed, for practical, legal, or ethical reasons, it might be inadvisable to record network packet content: capturing sensitive or illegal traffic can cause serious issues whether you act on it or not. In a strictly technical vein, the most troubling sorts of filters or processors are those that throw away the raw data and warn you with only processed or condensed information, because you can never recover or reanalyze what was discarded, even if new evidence shows up that changes perceptions.
And while (obviously) most think it important to defend systems against unauthorized access or intrusion, it is even more critical to protect the systems that monitor or aggregate network activity. It's bad enough that you might catch awkward or sensitive traffic yourself; you don't want the enemy to get it as well!
The site http://www.takedown.com/ has some excellent examples of the usefulness of packet sniffing.
Final Stage: Modifying Infrastructure
We've seen how to gather data and examine systems, turn on logging to increase our understanding of both, and even modify systems to gather further information and confound our opposition. We'll now talk about the final, extreme step, usually taken either by the paranoid or by those trying to learn: modifying the actual infrastructure that the system rests in.
This involves actually setting up a dedicated infrastructure (in vitro or in situ) for intruders to peruse, usually to keep them enticed with your faux system(s) so that you can track them back across the network or to learn from their behavior. Shunting traffic by changing a router or switch, putting up fake systems for bait, and other methods are all examples of such behavior. (For more information, see the accompanying text box entitled "Caution: Honey Pot.")
We now have a small confession to make: The examples used here were taken from just such a system, designed to entrap an attacker. We didn't set it up ourselves, but were instead granted access to the data after the system had been broken into (by a "real live system cracker") in the hopes that we could find something out about the incident. Both network sniffer and host-based forensic information were made available for this article.
Interestingly, although the attack yielded new information about yet another denial-of-service tool not previously seen (ho-hum), we learned much more about how not to set up such infrastructure traps than anything else; knowledge seems to come where you least expect it.
In the Kurosawa film Rashomon, we see many points of view of the same crime. While some data is corroborated, it is only after collecting the perspectives of all the different witnesses that the truth can be ascertained. Our traversal from the host to the network level has hopefully demonstrated that only the marriage of the two can construct a complete picture of the past. Network monitoring sees the verbs, or actions; hosts see the consequences. Indeed, you might well argue that a third piece, the intent and mind of the user, must be considered to put the final pieces of the puzzle into place.
Of course, most people are only prepared for an accident or incident after they've been burnt at least once. With individual systems, some host-level information can be recovered after an incident: logging data, MAC-time patterns, and so on. This isn't possible on an unprepared network, which forgets data as soon as it has passed. But perhaps the main difference between host and network monitoring is that on a host you must act immediately to prevent data pollution or corruption (by either yourself or others), while in network monitoring you must act ahead of time or you won't have the information at all.
In all cases, you should attempt to save as much data as you can for as long as you are able. It's easier to save the raw network data than host-level information because the latter is almost inevitably processed in some fashion before storage.
Of course, in real life, it's rare to have the opportunity, let alone the time or inclination, to examine an incident as thoroughly as we were able to in this case. However, double-checking your work is never a bad idea!