Spying on .NET Applications
Using global hooks, the CLR hosting interface, and the .NET remoting interface to monitor the flow of events in your .NET apps
By Dmitri Leman
The .NET Framework has an ambitious goal of providing a completely new environment for Windows programming. The .NET library provides a complete replacement for most of the Win32 API, but there are a few Win32 API features that have no corresponding classes in the .NET library. Most of these API routines can be accessed by a .NET program using Platform Invoke. One of the most important Win32 features that cannot be easily accessed from a .NET application is Windows hook support. Dino Esposito provides a partial solution for this problem in "Windows Hooks in the .NET Framework" (MSDN Magazine, October 2002), but his approach covers only local (thread) hooks, which run in the same process as the .NET application.
The question of how to implement global hooks in .NET persists. This interest is understandable: For many years, global hooks were used to implement such powerful applications as Spy++, automated testing, accessibility tools, and popup stoppers. Lack of global hook support in .NET explains why Spy++ was not updated in Visual Studio .NET, unlike most other tools and utilities. Spy++ is the primary debugging tool for Windows UI programming because it provides a dynamic view of windows and messages, something a debugger cannot do. Unfortunately, Spy++ became almost useless for .NET programmers. It can only display windows and messages, not .NET objects and events. When developing .NET GUI applications using the Windows Forms library, I felt a need to see a hierarchy of .NET objects, examine their properties, and monitor events.
That's why I developed a technique of injecting a .NET object into the address space of another .NET application, which allows me to bring Spy into the new age. In this article, I will explain how I used global hooks, the CLR hosting interface, the .NET remoting interface, and reflection to achieve my goal. Sources for a simple .NET spy are available for download from the Windows Developer web site as well as from my web site (http://www.forwardlab.com/). This simple spy displays a list of running .NET applications on a computer, and allows me to select one, inject an agent into it, display a tree of Windows Forms controls in that application, and watch the flow of events. I also developed a more sophisticated DotNetSpy application, which allows me to examine and modify properties on objects, execute methods, and select events to watch. It is also available from my web site. Using the presented technique and sources, it should be easy to develop powerful new .NET tools for debugging, automated testing, system monitoring, and so on.
After studying the problem, I divided it into four parts. The first is how a .NET application (spy) can inject code (agent) into another application space (target). The second is how the agent can attach to an existing instance of the .NET Runtime working inside the target. The third is how to examine .NET objects and monitor events in the target. And the fourth is how to communicate information back to the spy.
While looking for answers to the first question, I studied several techniques, such as the CLR Debugging API, the .NET remoting API, and Windows hooks. The debugging API is very powerful and would meet my needs. It allows, among other things, enumerating running processes, attaching to a process or starting a new one, and injecting code into the debuggee and executing it. For more information about the debugging API, see "Common Language Runtime Debug Overview" and the article on CLR debugging in MSDN Magazine [2]. I decided not to use the debugging API, primarily because it does not allow spying on an application that is running under the Visual Studio debugger (two debuggers cannot be attached to the same application simultaneously). I used the debugging API only to enumerate the running .NET processes.
.NET remoting is the official way of "accessing objects in other application domains." At first I was fooled by this title and hoped that it would solve all my problems. A quick study of .NET documentation revealed that remoting is based on the client-server model, which requires both applications to run objects that are designed to communicate with each other. This, obviously, will not meet my goal of injecting an agent into any .NET application without raising its suspicion.
Finally, I decided to use a Windows hook to inject the agent. A limitation of the hook approach is its inability to target applications without windows, such as Windows services. This is not a problem for the spy, which is designed to work with GUI applications. If there is a need to target non-GUI applications, other means of cross-application penetration can be used, such as the CreateRemoteThread API. The hook approach requires writing a native (unmanaged) Win32 DLL, then calling this DLL from a .NET application using Platform Invoke. The local hook solution described in [1] did not require a native DLL because a .NET delegate was passed as the callback to the SetWindowsHookEx routine. Since the local hook callback is called on the same thread, the delegate remains valid and accessible. This, obviously, is not possible in my case, since I want the callback to be called in the target application space. Therefore, I should use either a global hook (associated with all threads on the system) or a hook associated with the thread that owns a window in the target application.

The second question is the most challenging: How can Win32 code peek inside a .NET Runtime environment and inject a .NET agent into it? One approach is to use the CLR Debugging API, but I have already ruled it out.
The second technique is to use the CLR hosting interface. The key routine of this interface is CorBindToRuntimeEx. As the name of this function suggests, it does exactly what I need: It lets Win32 code bind to the CLR environment. Unfortunately, the documentation says that hosts use this API to load the CLR into a process but does not mention that it can also be used to attach to an already running CLR. Therefore, I will have to use this routine in an undocumented way. There is a danger that in future versions, Microsoft may patch this loophole and break my spy. Then I will have to find a trickier way to achieve my goal. But in .NET Versions 1.0 and 1.1, CorBindToRuntimeEx works perfectly. It provides access to the ICorRuntimeHost interface, which lets hosts start and stop the CLR, enumerate domains running in the process, and create and configure new domains. The hook code will enumerate existing domains (represented by the _AppDomain interface), then call the CreateInstanceFrom method on each domain to create an instance of the agent .NET class inside that domain. For a more traditional use of CLR hosting, see [3]. This completes the most complex part of the spy design. At this point, the agent object is instantiated inside the target application and is ready to take it under control.
To explore the structure of objects in the target application, the agent will use methods of the System.Windows.Forms.Control class. This is the base class for other classes in the System.Windows.Forms assembly, which have corresponding Win32 windows. The .NET Framework maintains a hierarchy of Control-based objects, which mostly corresponds to the hierarchy of windows. Therefore, to let the spy display a windows hierarchy, the agent should find the top-level control in the target application and then recursively enumerate its children. The agent will use the Control.FromHandle static method to get a Control object corresponding to the window handle passed by the spy. Then the Control.TopLevelControl property will give the main control and the Control.Controls collection will provide access to children. The agent will also bind handlers to several events in each Control object, such as Click, GotFocus, KeyDown, MouseDown, and others. These handler routines (located in the agent) will report events to the main spy for printing. A more sophisticated spy should allow the user to monitor selected events on selected controls, but the simple spy included with this article will monitor a fixed selection of events on all discovered controls.
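The tree walk and event hookup described above can be sketched as follows. This is a minimal illustration, not the code from the InjectLib sources: the class ControlTreeSketch and its methods DescribeFromHandle, Append, and Watch are hypothetical names, and only the Click event is monitored to keep the sketch short. It assumes a Windows Forms environment.

```csharp
using System;
using System.Text;
using System.Windows.Forms;

// Hypothetical helper; the names are illustrative and are not the ones
// used in the article's InjectLib sources.
public static class ControlTreeSketch
{
    // Resolve a window handle to its Control and describe the whole tree.
    public static string DescribeFromHandle(IntPtr hwnd)
    {
        // FromHandle returns null when the handle is not a Windows Forms window.
        Control control = Control.FromHandle(hwnd);
        if (control == null)
            return String.Empty;

        // TopLevelControl is null when the control has no top-level ancestor.
        Control root = control.TopLevelControl ?? control;
        StringBuilder output = new StringBuilder();
        Append(root, 0, output);
        return output.ToString();
    }

    // Recursively append one line per control, indented by nesting depth.
    public static void Append(Control control, int depth, StringBuilder output)
    {
        output.Append(' ', depth * 2)
              .Append(control.GetType().Name)
              .Append(" \"").Append(control.Name).Append("\"\n");

        foreach (Control child in control.Controls)
            Append(child, depth + 1, output);
    }

    // Attach a Click handler to a control and all of its descendants.
    // A real agent would subscribe to many more events.
    public static void Watch(Control control, Action<string> report)
    {
        control.Click += delegate(object sender, EventArgs e)
        {
            report(control.Name + ": Click");
        };
        foreach (Control child in control.Controls)
            Watch(child, report);
    }
}
```

In an agent, the report callback passed to Watch would forward the text to the spy rather than print it locally.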
The spy should also display properties of objects. This can be done by directly reading properties of the Control object, such as Width, Height, and so on. Alternatively, the agent can use the .NET reflection API to examine all fields and properties (including private ones) of the Control object. Here lies the biggest advantage of .NET Spy over the old Spy++. Over the last decade, many new controls were introduced, such as tree view, list view, and so on, but Spy++ was frozen, displaying only the coordinates, style, and a few other properties of a window. Like the Visual Studio debugger, .NET Spy can display all fields and properties of whatever class the target application uses to implement a window.
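A reflection-based dump of this kind might look like the sketch below. It is an assumption-laden illustration rather than the agent's actual code: MemberDumpSketch, Dump, and the SampleTarget type are hypothetical names introduced only for this example. The loop walks up the inheritance chain because private members of base classes are not returned when reflecting on a derived type.

```csharp
using System;
using System.Reflection;
using System.Text;

// Hypothetical target type used only to illustrate the output.
public class SampleTarget
{
    private int hitCount = 3;
    public string Title { get { return "demo"; } }
    public void Hit() { hitCount++; }
}

// Hypothetical helper; names are illustrative rather than taken from InjectLib.
public static class MemberDumpSketch
{
    const BindingFlags DeclaredInstance =
        BindingFlags.Instance | BindingFlags.Public |
        BindingFlags.NonPublic | BindingFlags.DeclaredOnly;

    // Return "name = value" lines for every instance field and property,
    // visiting each class in the inheritance chain in turn.
    public static string Dump(object target)
    {
        StringBuilder output = new StringBuilder();
        for (Type type = target.GetType(); type != null; type = type.BaseType)
        {
            foreach (FieldInfo field in type.GetFields(DeclaredInstance))
                output.Append(field.Name).Append(" = ")
                      .Append(Format(field.GetValue(target))).Append('\n');

            foreach (PropertyInfo property in type.GetProperties(DeclaredInstance))
            {
                // Skip write-only properties and indexers.
                if (!property.CanRead || property.GetIndexParameters().Length != 0)
                    continue;

                string text;
                try
                {
                    text = Format(property.GetValue(target, null));
                }
                catch (Exception ex)
                {
                    // A getter may throw (for example, a security exception);
                    // report the exception type instead of a value.
                    text = "<" + (ex.InnerException ?? ex).GetType().Name + ">";
                }
                output.Append(property.Name).Append(" = ").Append(text).Append('\n');
            }
        }
        return output.ToString();
    }

    static string Format(object value)
    {
        return value == null ? "null" : value.ToString();
    }
}
```

Calling MemberDumpSketch.Dump(new SampleTarget()) returns one line for the private hitCount field and one for the Title property. Returning a plain string keeps the result independent of the target's own types.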
The final question is: How can the agent communicate the collected information back to headquarters? Here .NET remoting fits nicely. When designing a communication using a remoting API, it is necessary to assign the roles of client and server, determine the activation mode, choose a channel type, and decide what classes have to be transported, how to transport them (by value or by reference), and how to share metadata. In the case of Spy, I decided to make the main Spy application play the role of a server and make the server object a singleton. One or many agents may be simultaneously active in different target applications. An agent should instantiate a client class, which should connect to the server and get a reference to the remote server object using the System.Activator.GetObject method. Then the client should call a method on a server and pass a reference to itself. The server should maintain a collection of all currently running clients (agents). This should allow the spy and the agent to call methods on each other to pass requests, get results, and monitor events. When the target is terminated normally, the destructor of the client should be called and the client should remove its registration from the server. Both client and server classes should belong to the same assembly packaged as a DLL. A path to this DLL should be passed to the _AppDomain.CreateInstanceFrom routine discussed earlier.
The .NET remoting interface provides a choice of two channels: HttpChannel and TcpChannel. HttpChannel uses the HTTP protocol and SOAP format (by default) to transport method calls and objects. HttpChannel is the best for communication across the Internet and through firewalls. Since Spy and agents run on the same computer, TcpChannel is the best choice because it has less overhead. Each client and the server should register an instance of TcpChannel with two important properties: port and name. Most .NET samples use a fixed port number, but this may cause conflicts with other applications. Therefore, I decided to pass port number 0 to have the system automatically assign an unused port. Then the server should call GetUrlsForUri to get the URL (which includes the automatically assigned port) and the client should use that URL to connect to the server. Spy will use the HookDLL to pass the URL to the agent.
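The channel setup can be sketched as shown below. This is a minimal illustration under stated assumptions, not the article's SpyServer code: DemoServer, ChannelSketch, StartServer, and Connect are hypothetical names, and the sketch targets the .NET Framework, where the channel class is spelled TcpChannel in the System.Runtime.Remoting.Channels.Tcp namespace (a project reference to System.Runtime.Remoting is required).

```csharp
using System;
using System.Collections;
using System.Runtime.Remoting;
using System.Runtime.Remoting.Channels;
using System.Runtime.Remoting.Channels.Tcp;

// Hypothetical server type standing in for the article's SpyServer.
public class DemoServer : MarshalByRefObject
{
    public string Ping(string who) { return "hello " + who; }
}

public static class ChannelSketch
{
    // Register a TCP channel on a system-assigned port, publish DemoServer
    // as a singleton, and return the URL that clients should connect to.
    public static string StartServer(string objectUri)
    {
        IDictionary properties = new Hashtable();
        // Channel names must be unique within an application domain.
        properties["name"] = "spy-" + Guid.NewGuid().ToString("N");
        // Port 0 asks the remoting system to choose a free port.
        properties["port"] = 0;

        // Null sink providers select the default binary formatter sinks.
        TcpChannel channel = new TcpChannel(properties, null, null);

        // On .NET 1.x the RegisterChannel overload takes only the channel.
        ChannelServices.RegisterChannel(channel, false);

        RemotingConfiguration.RegisterWellKnownServiceType(
            typeof(DemoServer), objectUri, WellKnownObjectMode.Singleton);

        // The returned URL includes the port that was actually assigned.
        return channel.GetUrlsForUri(objectUri)[0];
    }

    // Obtain a reference to the singleton published at the given URL.
    public static DemoServer Connect(string url)
    {
        return (DemoServer)Activator.GetObject(typeof(DemoServer), url);
    }
}
```

The string returned by StartServer is what the spy would hand to the agent, which then passes it to Connect from inside the target process.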
The final issue to decide is which objects should be exchanged between the agent and the spy. Some objects are remotable while others are nonremotable. Nonremotable objects cannot be represented in another application domain. Remotable objects can be marshaled either by reference or by value. The objects marshaled by reference are represented in another domain by proxies. A client calls the proxy, which transfers calls to the original object. All modifications made by the client to the state of the object stay with the object. Objects marshaled by value are copied and recreated in the remote domain. All calls and property changes made by the client to these objects affect only the copy and are never propagated to the original object. As I already explained, the server and the agent should exchange references to be able to call each other. This means that both the client and server should be marshaled by reference. This is achieved by deriving these classes from System.MarshalByRefObject. Finally, the agent will use several small classes to pass collected information to the server. These classes should be marshaled by value, which is done by marking them with the [Serializable] attribute. I considered passing Control and other objects from the target application to the spy by reference to let the spy examine them using reflection. This worked for standard .NET classes, but an exception was thrown if an object from the application's private assembly was passed. This means that unless the spy loads all private assemblies of all target applications, it should not directly touch references to their objects. Therefore, the agent should dump all properties and fields of these objects to a string and pass them to the spy as a string. Now the whole picture of the spy design is clear and shown in Figure 1.
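The two marshaling styles differ only in how the classes are declared, as the following sketch suggests. WindowInfoSketch and AgentSketch are hypothetical stand-ins for the article's descriptor and agent classes; their members are assumptions made for illustration.

```csharp
using System;
using System.Collections;

// Marshaled by value: the receiver gets its own copy of the data.
// Hypothetical stand-in for the article's ClassDescr and MemberDescr.
[Serializable]
public class WindowInfoSketch
{
    public string Name;
    public ArrayList Children = new ArrayList();
}

// Marshaled by reference: the receiver gets a proxy, and every call
// executes in the domain that owns the real object.
// Hypothetical stand-in for the article's AgentClient.
public class AgentSketch : MarshalByRefObject
{
    public WindowInfoSketch Describe()
    {
        WindowInfoSketch info = new WindowInfoSketch();
        info.Name = "root";
        // When called through a proxy, the result is serialized
        // and the caller receives a copy.
        return info;
    }
}
```

A caller holding a proxy to AgentSketch can invoke Describe remotely, but any change it makes to the returned WindowInfoSketch stays in the caller's copy.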
As Figure 1 shows, there are three components in Spy: Spy.exe, InjectLib.dll, and Hook.dll. The first two are .NET-managed components and will be written in the C# language. Hook.dll is an ordinary Win32 DLL and will be written in C++. Visual Studio .NET can manage these different types of projects in the same solution (workspace). Therefore, I first created a blank solution called "SimpleSpy," then added a new Visual C# project using the Windows Application template and named it "SpyGUI." After that, I added another Visual C# project using the Class Library template and named it "InjectLib." Then I added a new Visual C++ project using the Win32 project template, named it "HookDLL," and selected option "DLL" in the Win32 Application Wizard.
Next, I wrote the Windows hook implementation in the file HookDLL.cpp. It exports a single function, InjectSpyAgent, with four arguments: target window handle, path to InjectLib assembly DLL, agent object name, and server URL. This function copies arguments to a shared memory area, sets the CallWndProc hook for the target window's thread, sends a message to that window, and removes the hook. The CallWndProc hook routine calls the Bind routine (shown in Listing 1), which does the actual injection. It is important to avoid accidental loading of the .NET Framework into applications that don't use it. Therefore, HookDLL should not statically link to CLR libraries and should not call LoadLibrary on any of them. Instead, Bind calls GetModuleHandle("mscoree"), which succeeds only if the DLL is already loaded in the process. "mscoree" is the DLL that exports the CorBindToRuntimeEx function, which gives access to the main CLR hosting interface, ICorRuntimeHost. Then Bind calls the EnumDomains method on this interface and enumerates domains using the NextDomain method. Each domain is represented by the interface _AppDomain, which is a native representation of the managed interface System._AppDomain. This interface has (among others) several overloaded CreateInstanceFrom methods. The native version of the interface has these methods numbered (because COM interfaces do not support overloading). The Bind routine calls the CreateInstanceFrom_3 method to instantiate the agent in the domain. In order to use these interfaces, HookDLL.cpp has the line #import "mscorlib.tlb", which includes a native definition of many .NET interfaces (including _AppDomain). I also included headers "mscoree.h" for the CLR hosting interface definitions and "corhdr.h" for various other .NET definitions. I added the function EnumProcesses to the HookDLL to enumerate .NET processes using the ICorPublishProcessEnum interface from the .NET debugging interface.
To implement the second component, named "InjectLib," I renamed "Class1.cs" (created by Visual Studio) to "InjectLib.cs." This source file will contain the namespace InjectLib with a few classes: SpyServer, AgentClient, MemberDescr, and ClassDescr. SpyServer (shown in Figure 1) is a singleton performing the role of a .NET remoting server. It is instantiated by the Spy GUI and is used to communicate with AgentClient. AgentClient is instantiated inside the target application(s) by the HookDLL. As explained in the design section, SpyServer and AgentClient extend MarshalByRefObject, allowing them to exchange remote references with each other. Two small classes, ClassDescr and MemberDescr, are used to pass information from the agent to the server. These classes are marked by a [Serializable] attribute to be marshaled by value. All their members must be serializable as well. I only used members of type String and ArrayList. At the beginning of InjectLib.cs there are several using statements that permit use of classes in Windows Forms, remoting, diagnostics, collections, and reflection namespaces without prepending full names to all classes. To compile successfully, the project should have references to assemblies containing all used namespaces. Some references were added automatically when the project was created, but I needed to add references to System.Windows.Forms and System.Runtime.Remoting manually using the Project|Add Reference dialog.
At first, Spy GUI calls the static Init method in the SpyServer class. This method creates and registers an instance of the TcpChannel class using ChannelServices.RegisterChannel() and an instance of WellKnownServiceTypeEntry with RemotingConfiguration.RegisterWellKnownServiceType(). This well-known service type entry associates a URI (I use the string SimpleSpy) with the class (SpyServer) and the activation pattern (singleton). Then I pass the URI SimpleSpy to GetUrlsForUri on the channel to get a URL such as "tcp://localhost:1031/SimpleSpy". Here 1031 is an automatically assigned port, and SimpleSpy is the URI associated with the SpyServer class. Now any other process on the same computer may get access to the single instance of SpyServer inside the Spy GUI by calling Activator.GetObject(typeof(SpyServer), URL). The SpyServer.Init method has a few lines dealing with the incompatibilities between .NET Runtime 1.0 and 1.1. To strengthen security, Runtime 1.1 does not allow passing object references through remoting channels unless the TypeFilterLevel property of the serialization provider is set to TypeFilterLevel.Full. This breaks programs written for the 1.0 SDK. Programs written for 1.1 will not work under Runtime 1.0 because the TypeFilterLevel property does not exist in 1.0. To let the spy work under both runtimes, I decided to use the reflection GetProperty() and SetValue() methods. I also wrote the PrintRemotingConfiguration method, which prints all registered client and service type entries to help in debugging.
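The reflection trick for a property that may not exist on the running framework can be sketched as follows. This is an illustration of the general idea, not the code from SpyServer.Init; OptionalPropertySketch and TrySetEnumProperty are hypothetical names. The enum value is looked up by name because code that must also run on the older runtime cannot reference the enum type directly.

```csharp
using System;
using System.Reflection;

// Hypothetical helper; not part of the article's sources.
public static class OptionalPropertySketch
{
    // Set target.<propertyName> to the enum member called valueName,
    // but only if the running framework defines that property.
    // Returns true when the property was found and set.
    public static bool TrySetEnumProperty(object target, string propertyName, string valueName)
    {
        PropertyInfo property = target.GetType().GetProperty(propertyName);
        if (property == null || !property.CanWrite || !property.PropertyType.IsEnum)
            return false;   // older runtime: there is nothing to configure

        // Resolve the enum member by name at run time.
        object value = Enum.Parse(property.PropertyType, valueName);
        property.SetValue(target, value, null);
        return true;
    }
}
```

Applied to the case described above, the call would be TrySetEnumProperty(provider, "TypeFilterLevel", "Full"), where provider is the server formatter sink provider passed to the channel constructor. On a runtime without the property, the call returns false and leaves the provider unchanged.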
Spy GUI needs a reference to the instance of SpyServer, but it cannot simply call new SpyServer() because the remoting framework will create another instance later, which will violate the singleton design. Therefore, the Spy GUI calls AgentClient.Connect(), the same method that will be called by the AgentClient constructor inside the target application. Like SpyServer.Init, AgentClient.Connect also instantiates and registers a TcpChannel and then calls Activator.GetObject with the server's URL to get a reference to SpyServer. Since Spy GUI calls Activator.GetObject in the same domain in which the server is registered, a direct reference is returned. When other applications call the same Activator.GetObject, they will get a reference to a proxy representing SpyServer.
SpyServer has a RegisterAgent method, which is called from the constructor of AgentClient, and an UnregisterAgent method, which is called from AgentClient.Dispose. These methods add agents to and remove agents from the hash table m_Agents. Another method, GetAgent(), is called by the Spy GUI to find an agent in m_Agents for the given process ID. Currently, the simple spy is limited to one agent per process and cannot differentiate between domains. The ReportEvent() method is called by agents to report an intercepted event in the target. Finally, the AgentClient.Dispose() method disconnects all agents, so target applications will not hang after the Spy GUI terminates.
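A registry of this shape might be sketched as below. The method names follow the text, but the signatures, the AgentRegistrySketch class, and the use of a lock are assumptions; the article does not show them. Locking is included because remoting calls can arrive on different threads.

```csharp
using System;
using System.Collections;

// Hypothetical sketch of the agent bookkeeping; signatures are assumed.
public class AgentRegistrySketch
{
    // Maps a process ID to the agent running in that process.
    private readonly Hashtable agents = new Hashtable();

    public void RegisterAgent(int processId, object agent)
    {
        lock (agents.SyncRoot)
        {
            // One agent per process: a later registration replaces an earlier one.
            agents[processId] = agent;
        }
    }

    public void UnregisterAgent(int processId)
    {
        lock (agents.SyncRoot)
        {
            agents.Remove(processId);
        }
    }

    // Returns null when no agent is registered for the process.
    public object GetAgent(int processId)
    {
        lock (agents.SyncRoot)
        {
            return agents[processId];
        }
    }
}
```

Keying the table by process ID alone mirrors the limitation noted above: a process with several application domains would need a composite key.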
Besides the Dispose and Connect methods already mentioned, AgentClient has GetRelatedWindows and GetWindowProps methods. They are called by SpyServer when the GUI needs to retrieve the window tree and the properties of an individual window. AppendWindow is an internal method that is called recursively to generate the window tree. Another internal method, AddEventHandlers, adds handlers to several events on a given window. Then there are several event handler routines, such as MouseMoveEventHandler. All of them format event arguments into a string and call ReportEvent, which forwards the call to SpyServer.ReportEvent. This concludes the implementation of the InjectLib component.
The next step is to implement the Spy GUI. This will be a very simple GUI with two windows classes. The first class (ProcessesView) will be a form with a list view and three buttons: Close, Hook, and Refresh. The list view will be populated with running processes on the computer. The second class (ObjectView) will also be a form with a label and tree view. This view will be used to display a window's hierarchy and properties for a selected window. Events observed in the target application will simply be printed to the console along with debug information. I used the Visual Studio form designer to define the GUI layout, modify properties, and assign button-click handlers.
Next, I wrote the routine RefreshProcList, which calls System.Diagnostics.Process.GetProcesses() to get an array of all processes on the current computer. Unfortunately, I didn't find any way from managed code to determine whether a process has a .NET environment. Therefore, I added a call to the EnumProcesses function in HookDLL to enumerate .NET processes. Then I added code to add .NET processes to the list view. The RefreshProcList routine is called from the constructor and from the click handler for the Refresh button. It appears that the Process object returned from GetProcesses() does not hold any resources and, therefore, calling Dispose() is not necessary. Then I added the InitServer method to perform SpyServer initialization. This routine is executed on a separate thread started by the constructor of ProcessesView. A handler for the Hook button first gets the selected list view item and the Process object associated with it. Then it calls the InjectSpyAgent function from HookDLL, followed by SpyServer.GetAgent to get a remote reference to the agent in the target process. After that it calls GetRelatedWindows on this agent and, finally, passes the returned window tree to a new ObjectView window object. To compile the call to the unmanaged InjectSpyAgent function, it is necessary to write a prototype at the beginning of the ProcessesView class. I also assigned OnObjectViewAction as a handler for the ActionEvent in the ObjectView. This method (called when the user double-clicks on a window in the tree) calls AgentClient.GetWindowProps and creates another instance of ObjectView to display the properties of the selected window. The ObjectView implementation is simple. It has two overloaded SetInfo methods to populate the tree view with either a window tree or object properties.
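A Platform Invoke prototype for the native function might look like the sketch below. The text lists the four arguments of InjectSpyAgent but not their exact native types, the return type, or the character set, so all of those are assumptions here, as is the HookNativeSketch class name.

```csharp
using System;
using System.Runtime.InteropServices;

// Hypothetical holder for the prototype; in the article it sits at the
// beginning of the ProcessesView class.
public static class HookNativeSketch
{
    // Assumed signature: parameter types, return type, and Unicode
    // marshaling are guesses based on the four arguments named in the text.
    [DllImport("HookDLL.dll", CharSet = CharSet.Unicode)]
    public static extern int InjectSpyAgent(
        IntPtr targetWindow,     // handle of a window owned by the target process
        string assemblyPath,     // full path to the InjectLib assembly DLL
        string agentTypeName,    // name of the agent class to instantiate
        string serverUrl);       // URL that the agent uses to reach the server
}
```

The DLL is located and loaded only when the function is first called, so the declaration compiles even on a machine where HookDLL.dll is absent.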
Spy has code running in different processes; some of the code is managed, some native. The operation of installing a hook, injecting an agent, and establishing a connection was the most difficult to troubleshoot. It may have been possible to use a combination of Win32 and managed code debuggers, but I decided to use simple print statements to the console. I wrote a simple test application, which contains a single form with many different controls. I converted the Spy GUI and Test applications from Windows Application to Console Application using the Output Type property on the projects. Then I wrote a batch file, TestRun.bat, which uses the "start /b" command to launch Test.exe and SpyGUI.exe from the same console. I inserted a lot of Console.Out.WriteLine calls into most managed methods in the spy and printf calls in HookDLL. I also ensured that all exceptions are caught and printed.
At first glance, the technique presented here is a security breach because it injects an agent into an unsuspecting application and takes it under control (calling methods, modifying fields, and so on). Closer examination shows that the spy does not present a higher risk than any other Win32 application. Any Win32 application can install hooks, create remote threads, and use Win32 and .NET debugging APIs to break into other applications (some limitations may be imposed by NT security). The .NET environment has more sophisticated security, which allows the customization of access to specific resources. By default, code originating from the local computer receives the Full Trust permission set, which gives it access to all resources. The spy always operates on the local computer because Windows hooks cannot be installed remotely. Therefore, by default, the AgentClient can access objects, fields, and properties (including private) in the target applications. There are ways to customize .NET security policy on different levels (enterprise, computer, user, and domain). Therefore, it is possible to ban a specific assembly (for example, SpyGUI or InjectLib) from accessing certain resources (for example, unmanaged code). But it is not feasible to explicitly specify all assemblies, which may use the injection technique. It is also possible to programmatically customize .NET security. For example, attributes may be added to an assembly, class, or a member to restrict access. While testing the spy, I found that the ShowParams property in the System.Windows.Forms.Control class has an attribute that bans access by anybody except for the System.Windows.Forms assembly. When AgentClient tried to retrieve this property, an exception was thrown. The current version of the simple spy displays this exception instead of the property value (don't be surprised to see a security exception on the console while running the spy). 
Therefore, my conclusion is that the spy plays by Win32 and .NET security rules and does not pose an additional security risk. Of course, downloading unknown applications from the Internet and running them locally is very dangerous.
The injection technique presented here uses a combination of a Windows global hook, .NET CLR hosting, remoting, and reflection to build a key to unlock the door to the .NET Runtime environment. It can be used to build a .NET version of Spy++ and automated testing tools. It will be interesting to see what other new tools and applications can be developed using the injection. Unfortunately, a danger remains that the CorBindToRuntimeEx function will be modified in the future versions of .NET to close the existing backdoor. Until that happens, we have a good opportunity to debug and test our .NET applications using .NET Spy and other new tools.
1. "Windows Hooks in the .NET Framework," Dino Esposito. MSDN Magazine, October 2002.
2. "CLR Debugging: Improve Your Understanding of .NET Internals by Building a Debugger for Managed Code," Mike Pellegrino. MSDN Magazine, November 2002.
3. "Implement a Custom Common Language Runtime Host for Your Managed App," Steven Pratschner. MSDN Magazine, March 2001.
Dmitri Leman is a consultant in Silicon Valley specializing in .NET and Java development. He can be reached at [email protected] Forwardlab.com.