Memory-Aware Components

In an ideal world, your programs gracefully handle out-of-memory conditions and keep running. But in the real world...


September 05, 2007
URL:http://www.drdobbs.com/embedded-systems/memory-aware-components/embedded-systems/memory-aware-components/201804198

Kirk is a software developer working in the Rational Software division of IBM. He can be contacted at [email protected].


Most software that runs out of memory simply crashes. On modern operating systems, this happens when programs require more virtual memory than is available. A program that reserves or commits too much virtual memory can run out of free space. When that happens, it can misbehave: Heap allocations may fail, new threads may not start, stacks may fail to grow, and so on. It might exit politely or crash at that point, often without so much as a complaint about what's really gone wrong.

Ideally, your programs gracefully handle out-of-memory conditions and keep running. At the least, they might provide some detailed diagnostic output, or find ways to cope with the situation and survive until resources become available again. Such positive outcomes are possible if the program can identify virtual memory that is not actually required at the time when the low-memory problem occurs. Given modern component-based programming techniques, the program's components typically don't have enough information about each other to understand and comply with one another's virtual memory requirements as the "memory pressure" builds. While the information-hiding aspect of good component-based design is useful in many ways, it can hamper the components' ability to share limited virtual memory resources.

There are two kinds of unused virtual memory that become interesting when your program is memory-starved:

- Reserved virtual memory occupies part of your program's virtual address space, most likely because a component has proactively set it aside, yet it is clearly not being used.
- Committed, but otherwise unused, virtual memory can be found when components aren't making prudent choices regarding their memory footprints.

There's nothing that prevents a component from appropriating either reserved or unused committed virtual memory to prevent a crash, if your components have enough intelligence to recognize suitable ranges that might be appropriated. The appropriated ranges can be used to provide space for the preparation and presentation of diagnostic output, or for any other purpose required to keep your program running when it could not otherwise continue.

Some components proactively reserve virtual memory and don't use it. By reserving virtual memory, a component typically renders it unusable by other components of the same program that are running in the same process—unless the other components are "smart" enough to realize they could commit that memory for their own use, rather than cause a crash by running out of memory. You can build this sort of intelligence into your program, making your components smart enough to not die from out-of-memory conditions when reserved, uncommitted memory is available. To do that, you need a small virtual memory analyzer/watchdog component that records information about how virtual memory is used by your program. This information may include a set of timestamps recorded as the program runs, each time virtual memory is reserved. When virtual memory runs low, your other components might need your analyzer/watchdog component to grab the reserved memory region that has the oldest timestamp. The region is then available to be committed for whatever purpose necessary for the program to stay alive.
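Here is a minimal sketch of that selection step, written for Windows in C++. The names (ReservedRegion, GrabOldestReservedRegion) and the flat list of tracked reservations are illustrative rather than taken from the listings; the point is that committing an already-reserved range with VirtualAlloc() and MEM_COMMIT is all it takes to turn the oldest idle reservation into usable memory.

    #include <windows.h>
    #include <vector>

    // Hypothetical record for one reserved-but-unused region the watchdog knows about.
    struct ReservedRegion {
        void*  base;
        SIZE_T size;
        DWORD  reservedAt;   // GetTickCount() when the reservation was tracked
    };

    // Filled in by the tracking routine as reservations are intercepted.
    static std::vector<ReservedRegion> g_reservedRegions;

    // Called when memory runs low: commit the reservation with the oldest
    // timestamp and hand it back for emergency use.
    void* GrabOldestReservedRegion(SIZE_T* sizeOut)
    {
        ReservedRegion* oldest = nullptr;
        for (auto& r : g_reservedRegions) {
            if (!oldest || r.reservedAt < oldest->reservedAt)
                oldest = &r;
        }
        if (!oldest)
            return nullptr;

        // Commit the already-reserved range so it is backed and usable.
        void* p = VirtualAlloc(oldest->base, oldest->size, MEM_COMMIT, PAGE_READWRITE);
        if (p && sizeOut)
            *sizeOut = oldest->size;
        return p;
    }

Because the selection path only reads data recorded earlier in the run, it doesn't need to allocate anything itself at the very moment memory is scarce.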

Some components proactively commit virtual memory and never use it. It may be left in an uninitialized state, or filled with a simple pattern—usually all NULLs. On most modern platforms, virtual memory is committed at least a page at a time (for example, a page occupies four kilobytes of virtual memory on most versions of Windows). Pages of memory left uninitialized can be decommitted by any component in a process, and no harm results unless the committed memory is actually needed by the component that originally committed it. The same virtual memory analyzer/watchdog component that monitors reserved virtual memory may be extended to make software components smart enough to not die from an out-of-memory condition whenever committed, uninitialized memory is available. The analysis can involve tracking the pages of memory as "in use" or "not in use." One way to tell whether a page is in use is to run a compression routine, such as the LZW algorithm, on the contents of the page. This can be done for each page tracked by your analyzer/watchdog component when an out-of-memory condition occurs and when reserved, uncommitted pages are unavailable. When all or part of a page is found to be initialized, the page can be considered in use. The analysis can also involve recording a timestamp, during the run, each time virtual memory pages are committed. When virtual memory runs low, your other components might need your analyzer/watchdog component to decommit the uninitialized page with the oldest timestamp, so that the page can be reused as needed to keep your program running.
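As a rough illustration, the "in use" test can be as simple as scanning a page for nonzero bytes; a production watchdog might run LZW over the page instead, as described above. The routine names below are hypothetical, but VirtualFree() with MEM_DECOMMIT is the documented way to release a page's storage while keeping its address range reserved.

    #include <windows.h>

    // Simplified stand-in for the compression-based test described above:
    // treat a page as "in use" if any byte differs from zero.
    bool PageLooksUninitialized(const void* pageBase, SIZE_T pageSize)
    {
        const unsigned char* p = static_cast<const unsigned char*>(pageBase);
        for (SIZE_T i = 0; i < pageSize; ++i) {
            if (p[i] != 0)
                return false;   // found initialized data; leave this page alone
        }
        return true;            // all zeros: a candidate for decommitting
    }

    // Decommit a candidate page so its storage can be reused elsewhere.
    // MEM_DECOMMIT releases the backing storage but keeps the range reserved.
    bool StealUninitializedPage(void* pageBase, SIZE_T pageSize)
    {
        return VirtualFree(pageBase, pageSize, MEM_DECOMMIT) != 0;
    }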

Designing a Watchdog

Figures 1-5 show a way you could design a simple virtual memory analyzer/watchdog component that helps your program stay alive, at least long enough to get some diagnostic information out, if not longer. The idea is to track virtual memory as the various components in your program reserve or commit it. For this to work, you need code that intercepts the system calls that are responsible for reserving, committing, and freeing virtual memory. On some operating systems, such as Linux, the choice of which calls to intercept is straightforward: On Linux, virtual memory regions are created via calls to mmap() and released via calls to munmap(). On other operating systems, such as Windows, there's no documented API function that's responsible for creating all of the virtual memory regions for use by your process. However, there is an API function, VirtualAlloc(), which you can use to create some regions. If you debug into VirtualAlloc(), you'll reach an exported function that's called for most, if not all, of the regions your program creates. On current versions of Windows, including Vista and XP, this function is called NtAllocateVirtualMemory(). This function is paired with NtFreeVirtualMemory(), which is invoked to release regions.
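Before anything can be patched, the watchdog has to locate the functions to intercept. On Windows, both NtAllocateVirtualMemory() and NtFreeVirtualMemory() are exported by ntdll.dll, so a sketch like the following can resolve them at runtime; the typedefs reflect the native signatures but, since these calls are largely undocumented, treat them as illustrative rather than authoritative.

    #include <windows.h>

    typedef LONG (NTAPI* NtAllocateVirtualMemory_t)(HANDLE, PVOID*, ULONG_PTR,
                                                    PSIZE_T, ULONG, ULONG);
    typedef LONG (NTAPI* NtFreeVirtualMemory_t)(HANDLE, PVOID*, PSIZE_T, ULONG);

    // Saved so the detour routines can call the real functions later.
    static NtAllocateVirtualMemory_t g_ntAllocate = nullptr;
    static NtFreeVirtualMemory_t     g_ntFree     = nullptr;

    bool ResolveInterceptionTargets()
    {
        HMODULE ntdll = GetModuleHandleW(L"ntdll.dll");
        if (!ntdll)
            return false;
        g_ntAllocate = reinterpret_cast<NtAllocateVirtualMemory_t>(
            GetProcAddress(ntdll, "NtAllocateVirtualMemory"));
        g_ntFree = reinterpret_cast<NtFreeVirtualMemory_t>(
            GetProcAddress(ntdll, "NtFreeVirtualMemory"));
        return g_ntAllocate != nullptr && g_ntFree != nullptr;
    }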

There are numerous ways to arrange function interception. A simple approach is to replace the first few bytes of the function to be intercepted with an instruction, such as a jump, that passes control to a routine that you'd like to invoke whenever that function is called. Your routine can then restore the first few bytes of the intercepted function, call it with its original parameters, intercept it again, and then do any processing that you have in mind. This simple approach can meet the interception needs of the virtual memory analyzer/watchdog concept in Figures 1-5. Listing One (available at www.ddj.com/code/) does all of this on x86 architecture systems running Windows. Note that the code reached via the jump instructions can be improved for multithreaded programs if you add some form of serialization mechanism, so that the target system calls are always intercepted even when a new thread comes along. Some suggestions regarding the placement of synchronization calls are provided in Listing One, which sets up a handler for out-of-memory exceptions. The routines called by this handler will make use of the information tracked via the intercepted functions that create and release virtual memory regions.
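For illustration, here is a bare-bones version of the patching step on 32-bit x86: save the first five bytes of the target, then overwrite them with an E9 relative jump. It omits the serialization just mentioned and the restore/call/re-patch dance performed in Listing One, so treat it as a sketch rather than a drop-in replacement for the listing.

    #include <windows.h>
    #include <cstring>

    // Original prologue bytes, saved so the detour can temporarily restore them.
    static unsigned char g_savedBytes[5];

    // Patch the first five bytes of `target` with a relative jump to `detour`.
    // x86 only; a real interceptor also needs the serialization discussed above.
    bool PatchWithJump(void* target, void* detour)
    {
        DWORD oldProtect;
        if (!VirtualProtect(target, 5, PAGE_EXECUTE_READWRITE, &oldProtect))
            return false;

        memcpy(g_savedBytes, target, 5);                 // keep the original prologue

        unsigned char jump[5];
        jump[0] = 0xE9;                                  // JMP rel32 opcode
        LONG_PTR rel = (LONG_PTR)detour - ((LONG_PTR)target + 5);
        memcpy(&jump[1], &rel, 4);                       // displacement from end of the jump
        memcpy(target, jump, 5);

        FlushInstructionCache(GetCurrentProcess(), target, 5);
        VirtualProtect(target, 5, oldProtect, &oldProtect);
        return true;
    }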

You can arrange the interception of system calls to take place automatically when your virtual memory analyzer/watchdog module loads. This is accomplished (on Windows) in Listing One by doing the interception within the module's DllMain() call. That way, only one line of code is needed to load the module and kick off its virtual memory tracking mechanism on the fly. On Windows, the relevant line of code is a LoadLibrary() call; see Listing Two (also available at www.ddj.com/code/). After this call, the virtual memory allocation/deallocation calls in Listing Two are intercepted. Alternatively, you can dispense with the LoadLibrary() call altogether and link your virtual memory analyzer/watchdog module statically. If you statically link your watchdog to a component that loads at the beginning of each run, that causes virtual memory tracking to start early in the run, giving your watchdog more regions to choose from if virtual memory runs low.
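In outline, the module's entry point can look like this; InstallOutOfMemoryHandler() and InterceptAllocationCalls() are hypothetical stand-ins for the setup routines in Listing One, and the module name in the LoadLibrary() call is likewise made up.

    #include <windows.h>

    // Hypothetical stand-ins for the setup routines in Listing One.
    bool InstallOutOfMemoryHandler();
    bool InterceptAllocationCalls();

    BOOL WINAPI DllMain(HINSTANCE, DWORD reason, LPVOID)
    {
        if (reason == DLL_PROCESS_ATTACH) {
            // Register the exception handler first; only intercept the
            // allocation calls if the handler is successfully in place.
            if (InstallOutOfMemoryHandler())
                InterceptAllocationCalls();
        }
        return TRUE;
    }

    // In the host program, one line starts the tracking on the fly:
    //     LoadLibrary(TEXT("MemoryWatchdog.dll"));   // module name is illustrative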

Figures 1-3 describe virtual memory tracking routines that can be called from the intercepted functions. The effectiveness of these routines depends on what percentage of the actual unused virtual memory regions or pages have been tracked, by the time excessive memory pressure brings your program to a halt. For that reason, the interception should be done at the lowest possible level to catch the most possible regions. It should also be set up as early as possible during the run. The regions can be tracked in a list that's ordered by the regions' base addresses. In Figures 1-3, some data items are associated with each tracked region. These data items include the call chain leading to the creation of the region, and a timestamp. The timestamp is used when the program runs out of memory, to pick a region that's been unused for a long time as a target for reuse. The call chain can serve to identify the component responsible for creating the region. Listing Three (available at www.ddj.com/code/) provides a simple routine for collecting a call chain on the Intel platform.
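A sketch of the tracking routine itself might look like the following. The RegionRecord layout is illustrative; CaptureStackBackTrace() is a documented Windows API that gathers the return addresses making up the call chain, so it can stand in here for the hand-rolled walker in Listing Three.

    #include <windows.h>
    #include <map>

    // One tracked region: who created it (call chain) and when (timestamp).
    struct RegionRecord {
        SIZE_T size;
        DWORD  createdAt;               // GetTickCount() at creation time
        void*  callChain[16];           // return addresses leading to the creation
        USHORT chainDepth;
    };

    // Keyed (and therefore ordered) by base address, as described above.
    static std::map<void*, RegionRecord> g_regions;

    // Called from the intercepted allocation routine after the real call succeeds.
    void TrackRegion(void* base, SIZE_T size)
    {
        RegionRecord rec = {};
        rec.size = size;
        rec.createdAt = GetTickCount();
        // Capture the current call chain, skipping our own two frames.
        rec.chainDepth = CaptureStackBackTrace(2, 16, rec.callChain, nullptr);
        g_regions[base] = rec;
    }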

Figure 1

Figure 2

Figure 3

Stealing Regions

The vectored exception handler used in Listing One is a perfect fit for dealing with out-of-memory exceptions generated by any component in your process. With this kind of exception handling, even components you didn't develop yourself, such as third-party modules loaded by your program, benefit from your watchdog's protection. The DllMain() routine in Listing One sets up the module's vectored exception handler, up front, before intercepting the target system calls. Because this handler invokes routines that depend on virtual memory tracking, as arranged by intercepting these system calls, the interception is not performed unless the handler is successfully put in place first. Vectored exception handling is available on current versions of Windows, including Vista and XP. If your operating system doesn't support vectored exception handling, then you need to provide some means of handling out-of-memory exceptions on each thread that is started. Like the vectored handler in Listing One, your handler can invoke a routine that uses tracked data about your program's virtual memory regions to keep the program alive, as in Figure 4, which introduces the concept of stealing regions.
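In outline, a vectored handler along these lines does the job. It assumes the out-of-memory condition surfaces as a STATUS_NO_MEMORY exception (as it does, for example, when HeapAlloc() is called with HEAP_GENERATE_EXCEPTIONS); RecoverFromLowMemory() is a hypothetical stand-in for the recovery routines described in Figure 4.

    #include <windows.h>

    // Hypothetical recovery routine that uses the tracked region data.
    void RecoverFromLowMemory();

    // Vectored handlers see exceptions from every component in the process,
    // including third-party modules, before frame-based handlers run.
    LONG CALLBACK OutOfMemoryHandler(PEXCEPTION_POINTERS info)
    {
        DWORD code = info->ExceptionRecord->ExceptionCode;
        if (code == STATUS_NO_MEMORY) {
            RecoverFromLowMemory();               // steal a tracked region, log, etc.
            return EXCEPTION_CONTINUE_EXECUTION;  // retry the faulting operation
        }
        return EXCEPTION_CONTINUE_SEARCH;         // not ours; let others handle it
    }

    bool InstallOutOfMemoryHandler()
    {
        // The first parameter (1) asks for this handler to be called first.
        return AddVectoredExceptionHandler(1, OutOfMemoryHandler) != nullptr;
    }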


Figure 4

Figure 5

In Figures 4 and 5, a reserved virtual memory region is considered stolen when it's committed for use by a component that didn't reserve it. Similarly, an unused committed page is considered stolen when it's unreserved and recommitted for use by a component other than the one that originally committed it. How do you know the original component won't come along and try to use its reserved or committed space? You don't—but your watchdog can try to head off this possibility by stealing the memory that has been unused for the longest amount of time, out of all the memory it's tracking. The real protection involves, well, protection. On Windows, the VirtualProtect() API function can be applied to each stolen page or region to generate an exception when the region is accessed. The use of an exception handler to deal with all accesses to a region means, of course, that your program could run slowly when components start stealing regions. On the other hand, poor performance is generally more tolerable than a crash.
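On Windows, the guarding step is a single VirtualProtect() call. The sketch below uses PAGE_NOACCESS so that any touch of the stolen range raises an access violation the watchdog's handler can examine; the function names are illustrative.

    #include <windows.h>

    // After stealing a region, mark it inaccessible so any touch by the original
    // owner raises an exception the watchdog's handler can inspect.
    bool GuardStolenRegion(void* base, SIZE_T size)
    {
        DWORD oldProtect;
        return VirtualProtect(base, size, PAGE_NOACCESS, &oldProtect) != 0;
    }

    // When the handler decides the access is legitimate (same component),
    // restore normal access and let the faulting instruction retry.
    bool UnguardStolenRegion(void* base, SIZE_T size)
    {
        DWORD oldProtect;
        return VirtualProtect(base, size, PAGE_READWRITE, &oldProtect) != 0;
    }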

If your program is going to stay alive for long after stealing a region that was originally reserved or committed for some other purpose, then you need a way to tell one component from another. A simple way to do this is based on call chains. Each component itself occupies a region, or a set of regions, where it's loaded in virtual memory. Your watchdog may track those regions along with all the others. If so, then comparing the base addresses of the regions associated with components is a matter of a lookup in your region list. Better yet, you can call an API function, such as GetModuleHandle() on Windows, to find the base address of each module that appears along the call chain.
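The lookup can be a one-liner. GetModuleHandle() works by module name; the GetModuleHandleEx() variant sketched below accepts an address directly, which is convenient when all you have is a return address from a call chain.

    #include <windows.h>

    // Map a return address from a call chain to the base address of the module
    // that contains it; the HMODULE returned is the module's base address.
    HMODULE ModuleFromAddress(void* address)
    {
        HMODULE module = nullptr;
        GetModuleHandleEx(
            GET_MODULE_HANDLE_EX_FLAG_FROM_ADDRESS |
            GET_MODULE_HANDLE_EX_FLAG_UNCHANGED_REFCOUNT,
            reinterpret_cast<LPCTSTR>(address),
            &module);
        return module;   // nullptr if the address is not inside a loaded module
    }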

Your watchdog needs to be able to recognize calls into modules that contain API code or allocator code, so that it doesn't compare those modules when telling components apart. Comparing them only confuses the watchdog, making it fail to distinguish call chains coming from different components. Because many components share a common allocator, the call chains associated with region creation typically end up in the same few functions; that's why you can't tell components apart by the last few entries of a call chain associated with region creation. If you spend some time debugging your tracking code, particularly the routine that implements Figure 1, you can learn where this common region-creation code lies in virtual memory. By ruling out those ranges during call-chain comparison and instead looking to the next caller beyond one of these common routines in any given chain, you can get an idea of whether the component responsible for stealing a region is the same component that is now accessing it. If so, then your watchdog can safely unprotect the region and go ahead with the access. Otherwise, the time has evidently arrived to steal another region.
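A sketch of that comparison follows. The table of common allocator/API ranges is something you would fill in by hand after the debugging session just described; everything else is illustrative.

    #include <windows.h>

    // Ranges of common allocator/API code discovered while debugging the tracker;
    // call-chain entries inside these ranges are skipped during comparison.
    struct AddressRange { ULONG_PTR begin; ULONG_PTR end; };

    static AddressRange g_commonRanges[8];
    static int g_commonRangeCount = 0;

    static bool IsCommonCode(void* address)
    {
        ULONG_PTR a = reinterpret_cast<ULONG_PTR>(address);
        for (int i = 0; i < g_commonRangeCount; ++i) {
            if (a >= g_commonRanges[i].begin && a < g_commonRanges[i].end)
                return true;
        }
        return false;
    }

    // Return the first frame in a call chain that lies outside common code;
    // resolving that frame to a module (see ModuleFromAddress above) identifies
    // the component on whose behalf the chain was recorded.
    void* FirstDistinctCaller(void* const* chain, USHORT depth)
    {
        for (USHORT i = 0; i < depth; ++i) {
            if (!IsCommonCode(chain[i]))
                return chain[i];
        }
        return nullptr;
    }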

An obvious benefit of this region tracking/stealing scheme is the ability to construct a detailed report when an out-of-memory condition arises. Often, when a program runs out of memory, it can't even do so much as complain before the inevitable crash occurs. The technique of making available any unused virtual memory for use in constructing and displaying diagnostic output can be very useful in and of itself. The added ability to provide information about the virtual memory regions that are being used, including information about which components created them and when they were created, can provide the clues you need to prevent similar out-of-memory conditions down the road. You don't need to implement a component-recognition scheme to realize this benefit, if you're happy enough to get your diagnostic output and to let the program crash. But if you want to keep your program alive longer, you will probably want to clean up any virtual memory committed for diagnostic purposes as soon as you can after making your diagnostic information available.

Of course, the watchdog component itself adds to your program's memory footprint, but only modestly. The entire watchdog module can contain perhaps 3-5 times the amount of code in Listings One through Three. The tracking data for virtual memory regions is minimal, because there are typically not more than several hundred regions to track, even for a large and complex program. Plenty of components, even commercial ones such as some JVMs, leave large amounts of virtual memory reserved. If that reserved memory can be reused to keep your program alive in the face of otherwise overwhelming memory pressure, then your watchdog has earned its keep.
