
C++ Exceptions & the Linux Kernel



Dr. Dobb's Journal September, 2005

Leveraging the power of C++

By Halldór Ísak Gylfason and Gísli Hjálmtysson

Halldór was a research assistant at the Network Systems and Services lab at Reykjavik University and chief architect at Calidris. He is currently a Ph.D. student at the University of California at Berkeley. Gísli is a professor of computer science at Reykjavik University. He previously was a member of the technical staff at AT&T Bell Labs, where he researched networking systems and services, optical networking, router architectures, modeling, and performance evaluations. They can be contacted at halldorisak@ru.is and gisli@ru.is, respectively.


Exceptions are common in user-space programming. Most modern programming languages offer some form of syntactic constructs to handle exceptional events. The belief is widespread that the use of exceptions leads to more maintainable and robust systems, as error-handling code is separated from the normal flow. Multiple modern programming styles and best practices encourage the use of exceptions to handle exceptional cases. When a function detects an error, an exception is raised, which directs the program flow to the nearest dynamically enclosing handler for that type of exception. Thus, exceptions are handled differently from normal procedure exits, and exceptions are transparently passed through functions that do not handle the error.

In spite of their common use in user space, the use of exceptions in kernel space has been limited. In fact, some operating systems do not exploit higher level language abstractions at all. In particular, Linux is written in pure C. Whereas performance issues may rule out the use of some modern languages such as Java, one of the driving factors behind the creation of C++ was its use in writing operating systems. Some constructs were specifically introduced and designed based on observed patterns, and to address problems, in operating system implementations. Although C++ does not enforce strong type safety or other safety properties (as Java does), C++ offers an array of high-level language abstractions valuable for the construction of operating systems, and offers type safety and compiler support far beyond that of C. In particular, the safety provided by language-level polymorphism provides significant value as polymorphic behavior is widespread throughout any operating system.

In our work on the Pronto software router (http://netlab.ru.is/pronto/pronto.shtml), we have used many of the advanced C++ constructs extensively, including classes and virtual functions to achieve clarity, flexibility, and extensibility. We have shown that these benefits come at no performance penalty compared to the Linux implementation (see "The Pronto Platform: A Flexible Toolkit for Programming Networks using a Commodity Operating System" by G. Hjálmtysson. Proceedings of OpenArch 2000, March 2000). Our desire to employ the full range of C++ abstractions in the kernel and, in particular, to use C++ exceptions in our work on Pronto is the driver behind the work presented herein.

Of course, handling exceptional conditions is relatively expensive, regardless of the mechanisms employed for implementation. A substantial fraction of operating system code (for any operating system) is there to resolve exceptional conditions. Handling such conditions requires:

  • Detecting when an exceptional condition occurs.
  • Determining where (by whom) such conditions should be handled.
  • Doing the work needed to recover resources and otherwise handle the condition to return the operating system to a state from where it is safe to resume normal execution.

The costs of the first and the last requirements are independent of the mechanisms employed to implement the second. The use of language-level exception handling translates into machinery that provides a complete and systematic approach to the second requirement: identifying where a thrown exception should be handled.

The performance cost of using the language-level exception machinery must be weighed against both its nonperformance benefits and the total cost of handling a given exceptional condition. Important nonperformance quality metrics include reliability, robustness, flexibility, maintainability, and speed of development. In contrast, ad hoc exception-handling patterns, common particularly in the Linux kernel, consist of convoluted traces where exceptional function abort is communicated to the caller via an exceptional return value. The performance overhead of throwing exceptions in C++ is appreciable when compared, for example, to a simple return with an integer error code. However, in many cases, executing the exception-handling code dominates the total cost of recovering from an error condition. In those cases, the substantial benefits of exceptions warrant the relatively small cost.
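The contrast between the two styles can be sketched in user-space C++. The function names and the error constant below are hypothetical and serve only as illustration: in the return-value style every intermediate function checks and forwards the error by hand, whereas in the exception style the intermediate function contains no error-handling code at all.

```cpp
#include <stdexcept>

// Hypothetical error code, loosely modeled on the kernel's negative-errno style.
constexpr int kErrInvalid = -22;

// Style 1: the error travels back as a return value; each caller must check it.
int parse_rc(int raw, int* out) {
    if (raw < 0) return kErrInvalid;
    *out = raw * 2;
    return 0;
}

int middle_rc(int raw, int* out) {
    int rc = parse_rc(raw, out);
    if (rc != 0) return rc;   // manual propagation to the caller
    *out += 1;
    return 0;
}

// Style 2: the error travels as an exception; middle_ex has no error code.
int parse_ex(int raw) {
    if (raw < 0) throw std::invalid_argument("negative input");
    return raw * 2;
}

int middle_ex(int raw) { return parse_ex(raw) + 1; }

// The nearest dynamically enclosing handler converts the exception
// back into an error code at the boundary.
int top_ex(int raw) {
    try {
        return middle_ex(raw);
    } catch (const std::invalid_argument&) {
        return kErrInvalid;
    }
}
```

Both variants compute the same result on success; they differ only in where the propagation logic lives.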

Clearly, throwing a C++ exception is more expensive than returning from a function. Therefore, there are cases where using exceptions should be avoided. This typically applies when the exception handling is trivial and a simple return value is appropriate, for exceptions that occur relatively frequently (and thus perhaps constitute a branch rather than a true exception), or for code that operates on such a fine time scale that even a small overhead is burdensome. However, rather than ruling out language-level exceptions, the added cost instead determines the granularity appropriate for the use of exceptions, and/or the rarity at which the exceptional case must occur to justify the cost.

Implementation of Exception Handling

There are two main approaches to implementing exception handling in modern languages such as C++ (see "Exception Handling for C++," by Andrew Koenig and Bjarne Stroustrup. Journal of Object Oriented Programming, July/August 1990). The first approach employs dynamic registration based on the setjmp/longjmp methods. This method manages a stack of execution contexts: each entry to a try block pushes an execution context onto the stack through the setjmp method, and each normal exit pops it off again. When an exception is thrown, control transfers to the most recently pushed context through the longjmp method. This approach incurs some (albeit small) cost when entering try blocks, but has the advantage that it is portable in the sense that it is possible to generate ANSI C from a C++ program.
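A minimal sketch of dynamic registration, assuming a single thread and exceptions that are plain integers. All names here (TryContext, throw_code, guarded) are hypothetical; a real compiler-generated scheme also runs destructors and matches exception types, which this sketch omits.

```cpp
#include <csetjmp>
#include <cstdlib>

// One saved execution context per active "try block".
struct TryContext {
    std::jmp_buf env;
    TryContext*  prev;
};

static TryContext* g_top  = nullptr;  // top of the context stack
static int         g_code = 0;        // the "exception object" (an int here)

// Emulates throw: pop the innermost context and jump to it.
void throw_code(int code) {
    if (g_top == nullptr) std::abort();  // no enclosing try block
    g_code = code;
    TryContext* ctx = g_top;
    g_top = ctx->prev;
    std::longjmp(ctx->env, 1);
}

int risky(int x) {
    if (x < 0) throw_code(42);
    return x + 1;
}

// Emulates: try { return risky(x); } catch (int code) { return -code; }
int guarded(int x) {
    TryContext ctx;
    ctx.prev = g_top;
    g_top = &ctx;                  // entering the try block: push (the runtime cost)
    if (setjmp(ctx.env) == 0) {
        int r = risky(x);
        g_top = ctx.prev;          // normal exit: pop
        return r;
    }
    return -g_code;                // handler; throw_code already popped the context
}
```

The push on entry to guarded is the per-try-block overhead mentioned above; it is paid even when nothing is thrown.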

The second approach is based on a statically generated table of program counter values that map try blocks into program counter values for exception handlers. When an exception is thrown, this table is searched for the appropriate handler using the current program counter value as entry point. This method incurs no runtime overhead when entering a try block (zero instructions), at the expense of increased overhead when an exception is thrown and its handler has to be located.
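The table lookup can be sketched as a binary search over program counter ranges. The table layout and the addresses below are invented for illustration and do not reflect the actual format emitted by any compiler.

```cpp
#include <algorithm>
#include <cstdint>
#include <iterator>

// One entry per try block: the pc range it covers and its handler address.
struct RangeEntry {
    std::uintptr_t start;    // first pc inside the try block
    std::uintptr_t end;      // one past the last pc inside the try block
    std::uintptr_t handler;  // pc of the corresponding handler
};

// Hypothetical table, sorted by start address, ranges not overlapping.
static const RangeEntry kTable[] = {
    {0x1000, 0x1040, 0x1100},
    {0x2000, 0x2010, 0x2200},
    {0x3000, 0x3080, 0x3300},
};

// Returns the handler for the given pc, or 0 if no try block covers it.
std::uintptr_t find_handler(std::uintptr_t pc) {
    const RangeEntry* first = std::begin(kTable);
    const RangeEntry* last  = std::end(kTable);
    // Find the first entry whose start lies beyond pc, then step back one.
    const RangeEntry* it = std::upper_bound(
        first, last, pc,
        [](std::uintptr_t value, const RangeEntry& e) { return value < e.start; });
    if (it == first) return 0;
    --it;
    return (pc < it->end) ? it->handler : 0;
}
```

Nothing in this scheme executes when a try block is entered; all of the work is deferred to the lookup at throw time.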

The current GNU g++ compiler implementation, on which we have built our runtime library, implements exception handling using the table-driven approach. Former versions of the compiler used the setjmp/longjmp method, and that version can still be compiled into the implementation. The GNU implementation is contained in the (user-level) application binary interface (ABI), which accompanies the compiler as part of the standard runtime library. When using C++ exceptions, GNU g++ generates calls to the ABI; for example, the throw operator is transformed into a call to the ABI function __cxa_allocate_exception followed by __cxa_throw. Versions 3.x of GNU g++ implement the C++ ABI specification for the IA-64 (see "C++ ABI for Itanium: Exception Handling," http://www.codesourcery.com/cxx-abi/abi-eh.html). The aim of the C++ ABI specification is to standardize the object layout and the interface of the object code to the runtime system. Thus, code compiled with old versions of the compiler should be compatible with newer releases, and object code from different compilers should be compatible.
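To make the transformation concrete, the following user-space sketch performs by hand roughly what the compiler emits for `throw 7;`. It assumes a toolchain that follows the Itanium C++ ABI and ships the `<cxxabi.h>` header (GNU g++ and Clang on Linux do); the function name is hypothetical, and the exact code a given compiler version generates may differ.

```cpp
#include <cxxabi.h>
#include <typeinfo>

// Hand-written equivalent of:  try { throw 7; } catch (int v) { return v; }
int thrown_via_abi() {
    try {
        // Step 1: reserve storage for the exception object.
        void* mem = abi::__cxa_allocate_exception(sizeof(int));
        *static_cast<int*>(mem) = 7;
        // Step 2: hand the object, its type information, and its destructor
        // (none for an int) to the unwinder. This call does not return.
        abi::__cxa_throw(mem, const_cast<std::type_info*>(&typeid(int)), nullptr);
    } catch (int v) {
        return v;
    }
    return -1;  // not reached
}
```

The catch clause matches because the runtime compares the type information passed to __cxa_throw against the type named in the handler.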

Employing a language-level exception mechanism at kernel level has not become common in practice. Some operating systems are written at least partly in C++ but generally do not employ C++ exceptions. However, Windows NT provides a facility called "Structured Exception Handling" (SEH) (see "A Crash Course on the Depths of Win32 Structured Exception Handling," by Matt Pietrek. Microsoft Systems Journal, January 1997), which can be used in kernel device drivers. SEH is an operating-system facility and is, therefore, independent of any compiler or language. SEH uses dynamic registrations of try blocks and thus does affect normal program flow. Each thread is associated with a stack of exception registrations, which contain a function pointer to a handler function. Although SEH is similar to C++ exception handling, it has different semantics. In the SEH model, exceptions are thrown explicitly with the RaiseException Win32 routine and, in contrast to the C++ exception model, exceptions in the SEH model are single integers. However, the SEH model also covers processor-level errors, such as divide-by-zero, access violation, and stack overflow, which requires support from the OS.

When an exception is raised in the SEH model, the operating system calls the handler functions on the exception registration stack in sequence. Each handler function decides whether to handle the exception, to pass it through, or to resume execution at the point where the exception was raised. One drawback of SEH is that it does not call class destructors during stack unwinding; however, the SEH-specific __finally block can be used to clean up resources. The Microsoft C++ compiler implements C++ exceptions on top of the SEH model, and as a consequence, a catch(...) block catches all C++ exceptions as well as processor-level errors, such as segmentation faults. By contrast, on Linux, code compiled with the GNU g++ compiler does not catch segmentation faults in a catch(...) block.

Implementing the C++ Runtime Support

The first step in our work to support kernel-level exceptions was to create and include an implementation of the C++ ABI in the kernel. We started by carving out the user-level GNU ABI implementation as the basis for our implementation; however, simply porting user-level code to the kernel requires some changes. The malloc function used to reserve space on the heap for exceptions is replaced with the kernel-level kmalloc function. Furthermore, the GNU library is threadsafe and uses the pthread library for locking. However, as the pthread library is not available in kernel space, we modified the locking mechanism to use kernel-level spinlocks.

The semantics of the C++ throw operator requires the implementation to keep a stack of active exceptions to support the usage of throw without any arguments. In user space, this is done using thread-local storage. To achieve this effect, in kernel space, we augmented the Linux task_struct to include information on the active exceptions; see Listing One.
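The following user-space sketch shows why the runtime has to remember the active exception. The helper function receives no exception object, yet `throw;` inside it rethrows whatever is currently being handled by its caller. The function names are hypothetical.

```cpp
#include <stdexcept>

// Must be called from within a catch block. No exception is passed in;
// the runtime finds it through its record of active exceptions.
int classify_current_exception() {
    try {
        throw;  // rethrow the exception currently being handled
    } catch (const std::out_of_range&) {
        return 1;
    } catch (const std::runtime_error&) {
        return 2;
    } catch (...) {
        return 3;
    }
}

int run(int which) {
    try {
        if (which == 1) throw std::out_of_range("range");
        if (which == 2) throw std::runtime_error("runtime");
        if (which == 3) throw 42;
        return 0;
    } catch (...) {
        return classify_current_exception();
    }
}
```

In user space this record lives in thread-local storage; in the kernel, the per-task fields described above play the same role.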

When creating ELF executables, GNU g++ silently links in two object files at the front and the back (crtbegin.o and crtend.o, respectively). Furthermore, GNU g++ adds initialization code into the ELF .init section, and clean-up code into the .fini section. This is necessary to ensure that global constructors and destructors are run, and that the exception tables are registered with the ABI. To enable the usage of exceptions and global objects in the kernel, we modified the Linux makefile rule for the kernel image and kernel modules to link with those two files. Furthermore, we ensure that the initialization routines are called on module load by clever use of preprocessor macros, which modify the definition of the module initialization and finalization functions, module_init and module_exit. This is necessary because the kernel module loader in Linux pays no attention to the ELF .init section.

We implemented several optimizations to the standard GNU implementation of exceptions because our original measurements indicated that the cost of using exceptions was somewhat high. One optimization concerns the fact that the standard GNU implementation unwinds the stack in two phases—the first phase locates a handler, and the second phase unwinds the stack to the handler. The rationale for this two-phased approach is that, in the case that no handler is found, the stack frames have not been destroyed and debuggers can inspect the state of the frame that threw the exception. However, for our use at kernel level, we don't see this cost justified. In fact, we feel that even in user space, programmers should have the option of having the compiler optimize this debugging help out of the code.
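The difference between the two strategies can be illustrated with a toy model in which the call stack is a vector of frames. This is only a simulation under simplifying assumptions (one flag per frame, no real registers or destructors); the type and function names are hypothetical and it is not the unwinder code itself.

```cpp
#include <vector>

struct Frame {
    bool has_handler;  // this frame contains a matching catch clause
    bool cleaned;      // set once the frame's cleanup has run
};

// Two-phase: search first without touching any frame, then clean up.
// If no handler exists, every frame is left intact for a debugger.
int unwind_two_phase(std::vector<Frame>& stack) {
    int handler = -1;
    for (int i = static_cast<int>(stack.size()) - 1; i >= 0; --i) {
        if (stack[i].has_handler) { handler = i; break; }
    }
    if (handler < 0) return -1;
    for (int i = static_cast<int>(stack.size()) - 1; i > handler; --i) {
        stack[i].cleaned = true;
    }
    return handler;
}

// Single-phase: clean up while searching, so each frame is visited once.
// If no handler exists, the frames have already been destroyed.
int unwind_single_phase(std::vector<Frame>& stack) {
    for (int i = static_cast<int>(stack.size()) - 1; i >= 0; --i) {
        if (stack[i].has_handler) return i;
        stack[i].cleaned = true;
    }
    return -1;
}
```

When a handler exists, both functions leave the stack in the same state; the single pass simply avoids walking the frames twice.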

To enhance the performance of exceptions further, we included an optimization that improves the mechanisms used to search for the handler of an exception by caching frame state of functions, which is used to restore registers that have been saved on the stack. Our optimization caches this frame state data in a hash table, indexed by program counter. When an exception is thrown the first time through a function, or more specifically, the first time through a certain place in the function, the frame state is computed and subsequently inserted into the hash table. Subsequent throws through this place result in a successful lookup in the hash table. The importance of this optimization increases as more exceptions are used in the kernel. This is because the time needed to locate the frame descriptor entry for a function is proportional to the number of modules that use exceptions and the number of functions within those modules.
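The caching idea can be sketched as follows. The FrameState fields and the compute function are placeholders invented for this example, and std::unordered_map stands in for whatever hash table a kernel implementation would use.

```cpp
#include <cstdint>
#include <unordered_map>

// Placeholder for the data needed to restore registers for one frame.
struct FrameState {
    std::uintptr_t cfa_offset;
    int            saved_regs;
};

class FrameStateCache {
public:
    // Returns the frame state for pc, computing it only on the first request.
    const FrameState& lookup(std::uintptr_t pc) {
        auto it = cache_.find(pc);
        if (it != cache_.end()) {
            ++hits_;
            return it->second;
        }
        ++misses_;
        return cache_.emplace(pc, compute(pc)).first->second;
    }

    int hits() const { return hits_; }
    int misses() const { return misses_; }

private:
    // Stand-in for the expensive search through frame descriptor entries.
    static FrameState compute(std::uintptr_t pc) {
        return FrameState{pc & 0xff, static_cast<int>(pc % 7)};
    }

    std::unordered_map<std::uintptr_t, FrameState> cache_;
    int hits_ = 0;
    int misses_ = 0;
};
```

The first throw through a given program counter pays for compute; later throws through the same place only pay for the hash lookup.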

The cost of dynamic type checking in C++ is highly dependent on the method used to encode the runtime type information in the objects. GNU g++ follows the traditional approach and associates with each class a type information object that encodes the type of the class as a mangled string and puts a pointer to this object in the virtual table for the class. GNU g++ uses weak symbols to reduce the dynamic type checking to a pointer comparison, thus avoiding the more expensive string comparison. Each time a class containing virtual functions is used in a source file, GNU g++ generates the virtual table, type information object, and type name string as weak symbols. The user-space linker, ld, ensures that there is only one copy of this object, which renders the simple pointer comparison sufficient. However, the kernel module loader, which in the 2.6 versions of the kernel is exclusively in kernel space, does not handle these weak symbols and always relocates references to weak symbols to the weak definition within each object file. Therefore, multiple type information objects may exist for the same class, and pointer comparison becomes insufficient when doing dynamic type checking across kernel modules. To avoid this overhead, we modified the Linux kernel module loader to handle these weak symbols; the first time a weak symbol is encountered, it is added to the symbol map, and on subsequent encounters, the relocation is done relative to the first symbol.
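The loader change amounts to a "first definition wins" rule for weak symbols. The sketch below models it in user space with a std::map keyed by symbol name; the class and function names are hypothetical and the real loader operates on ELF relocation entries rather than strings and pointers.

```cpp
#include <map>
#include <string>

class SymbolMap {
public:
    // Returns the address to relocate against. The first weak definition
    // seen for a name is recorded; later modules are pointed at that one
    // instead of at their own local copy.
    const void* resolve_weak(const std::string& name, const void* local_def) {
        auto result = table_.emplace(name, local_def);
        return result.first->second;
    }

private:
    std::map<std::string, const void*> table_;
};

// With a single copy per class, type identity is a pointer comparison.
bool same_type(const void* type_info_a, const void* type_info_b) {
    return type_info_a == type_info_b;
}
```

For example, if two modules each carry a weak type information object for the same class, both resolve to the copy from the module loaded first, so same_type succeeds across module boundaries.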

Using Kernel-Level Exceptions

The C++ kernel-level runtime support for Linux provides complete runtime support for C++, including support for virtual functions, memory allocation operators, global constructors/destructors, dynamic type checking, and exceptions. The code is installed by applying a patch to the Linux kernel and enables the full use of C++ using the GNU g++ compiler.

Using our new C++ kernel-level runtime support, programming in C++ at kernel level becomes similar to programming in user space. The compiler compiles files with the suffix .cc as C++ files. However, the Linux kernel distribution is written in vanilla C, so C++ source files need to include C header files to interface with the Linux core. This introduces a problem not commonly encountered in user space, as some of the C++ keywords have been used as identifiers in the Linux header files. To combat this, we have provided two inclusion files, begin_include.h and end_include.h, with our distribution that should be used to enclose the Linux C header files, as in Listing Two. These two files use #define and #undef, respectively, to redefine these identifiers to names accepted by the C++ compiler.
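The renaming technique can be sketched in a single file. The macro choices and the struct below are assumptions made for illustration; the actual contents of begin_include.h and end_include.h are not shown in this article.

```cpp
// What a begin_include.h-style header might do (assumed): rename the
// keywords that the C headers use as ordinary identifiers.
#define new   new_
#define class class_

// Stand-in for a C header that uses C++ keywords as member names.
extern "C" {
struct c_device {
    int   class;  // seen by the C++ compiler as class_
    void* new;    // seen by the C++ compiler as new_
};
}

// What an end_include.h-style header might do (assumed): restore the keywords.
#undef class
#undef new

// C++ code refers to the members by their renamed identifiers.
int device_class(const c_device& d) { return d.class_; }
```

After the #undef lines, class and new behave as keywords again, so the rest of the C++ source file is unaffected.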

In the following examples, we use a class hierarchy (Listing Three) where we group the exceptions according to sub-systems, using inheritance. The top-level exception class—OSException—consists of a message, severity, and the virtual method report that, by default, prints the message through printk if the severity is MAJOR or FATAL. The other two exceptions defined are NetworkException, derived from OSException, and ProntoException, derived from NetworkException.

System Calls

The most straightforward use of exceptions in kernel space is in system calls. The Pronto architecture introduces three new system calls to the Linux kernel. New types of packet processors (an abstract data type for processing in the data path of the Pronto router) can be plugged into the operating system at runtime and their behavior manipulated through type-specific system calls that are dynamically linked through virtual functions. To promote safety, it is beneficial to catch all exceptions thrown by packet processors.

Listing Four shows the use of exceptions to guard a system call. In this example, the sys_pproc_type_call is the entry point from the system call. Its only function is to dispatch a method invocation to thePProcKType, which is an object in a dynamically loaded module. To guard the dispatch, the virtual call is performed inside a try block.

The system call catches all ProntoExceptions and calls the virtual function report. The packet processors can throw subclasses of ProntoException and customize the report function. Finally, all other exceptions are caught with the second clause. This could include processor-level errors if the OS provides support for mapping processor-level errors into catchable exceptions.

Using Try Blocks in the Data Path

The Pronto data path consists of a classifier that maps packets to flows. Each flow is associated with a forwarding path consisting of chains of packet processors. Each forwarding path may have multiple branches. Examples of packet processors include basic IP forwarding, tunnel entry/exit, NAT functionality, and more. Packet processors are dynamically added to the router at runtime.

Listing Five shows how we employ exceptions in this critical part of the data path. A try block guards the processing of a packet as it is sent through the chain of packet processors associated with the flow it belongs to (identified by the call to the classifier above the try). As in the previous example, there are two catch clauses, one catching all ProntoExceptions, the other catching everything else.

It is worth noting, in this example, that as new types of packet processors, say for example IPSec tunnel entry, are introduced, they may in turn introduce new subclasses of the ProntoException, defining a new handler (the report method). This way, the Pronto data path is capable of performing type-specific exception handling for new, dynamically installed types.
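A user-space sketch of this extension mechanism follows. The classes are simplified stand-ins for those in Listing Three: report returns a string instead of printing through printk so that the result can be inspected, and IPSecException and its message text are hypothetical.

```cpp
#include <string>
#include <utility>

// Simplified stand-in for the ProntoException of Listing Three.
class ProntoException {
public:
    explicit ProntoException(std::string msg) : message_(std::move(msg)) {}
    virtual ~ProntoException() = default;
    virtual std::string report() const { return "pronto: " + message_; }

protected:
    std::string message_;
};

// A new packet processor type brings its own exception subclass
// and with it a type-specific handler.
class IPSecException : public ProntoException {
public:
    using ProntoException::ProntoException;
    std::string report() const override {
        return "ipsec: " + message_ + " (dropping SA)";
    }
};

// The data path only knows the base class; catching by reference
// preserves the dynamic type, so the override is the one that runs.
std::string dispatch(bool ipsec) {
    try {
        if (ipsec) throw IPSecException("bad SPI");
        throw ProntoException("bad packet");
    } catch (const ProntoException& e) {
        return e.report();
    } catch (...) {
        return "unknown";
    }
}
```

The catch clause in dispatch never needs to change when new subclasses are added, which is what allows dynamically installed types to supply their own handling.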

Evaluation

For the purpose of measuring the absolute cost of throwing an exception, we implemented a kernel module that throws an integer out of a function. Our measurements were performed on an Intel Pentium 3, 996.859 MHz running the Linux 2.6.6 kernel that has been patched to include Pronto and the C++ runtime library.

To put the numbers into context, we measured the performance of the Linux printk function, which is commonly used in exceptional circumstances to communicate error messages. The time to print a string of length 6, printk("Error\n"), was measured to be 18.15 ms. Typical usage of printk includes formatting the strings, which is even more expensive.

As expected, the cost of exceptions is dependent on the number of stack frames that the exception is thrown through. Table 1 tabulates how the number of stack frames affects the cost of throwing an exception, using our optimized implementation.

We observe that the cost increases by about 0.35 ms with each stack frame. However, when using other types of error-handling techniques, the cost also increases with the number of stack frames traversed.
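A user-space harness in the spirit of this measurement might look like the sketch below. It is not the kernel module used for the tables; the function names are hypothetical, the noinline attribute is specific to GCC and Clang, and absolute numbers depend entirely on the machine and runtime.

```cpp
#include <chrono>

#if defined(__GNUC__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

static volatile int g_sink = 0;

// Throws an integer from `depth` frames below the caller.
NOINLINE void throw_through(int depth) {
    if (depth == 0) throw 1;
    throw_through(depth - 1);
    g_sink = depth;  // work after the call keeps it from becoming a tail call
}

// Average time in nanoseconds for one throw and catch through `depth` frames.
// Returns -1.0 if any throw was not caught as expected.
double avg_throw_ns(int depth, int iterations) {
    int caught = 0;
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; ++i) {
        try {
            throw_through(depth);
        } catch (int v) {
            caught += v;
        }
    }
    auto stop = std::chrono::steady_clock::now();
    if (caught != iterations) return -1.0;
    return std::chrono::duration<double, std::nano>(stop - start).count() / iterations;
}
```

Calling avg_throw_ns with increasing depth values gives a rough picture of the per-frame cost on a given system.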

For comparison, the cost of using exceptions without our optimizations is tabulated in Table 2. We observe that the increase in cost for each stack frame in the GNU g++ implementation without our optimizations is around 2.5 ms. Hence, the effect of our optimization is even more impressive when throwing the exception through multiple functions.

Finally, we measured the effect of using try blocks in the data path when no exceptions occur. We observed an increase of 2.8 percent in packet latency, consistent with the results of other writers. Using exceptions induces a slight overhead even though no instructions are executed on entry to try blocks; this can be attributed to compilers optimizing less successfully in the presence of try blocks.

Summary

In this article, we discussed our C++ kernel-level runtime support for Linux, which lets you use the full power of C++ in kernel-space programming, including global constructors and destructors, dynamic type checking, and exceptions.

DDJ



Listing One

struct task_struct {
   volatile long state;
   struct thread_info *thread_info;
   ...
#ifdef CONFIG_CXX_RUNTIME
   struct {
     void *caughtExceptions;
     unsigned int uncaughtExceptions;
   } cxa_eh_globals;
#endif
}; 


Listing Two
#include <begin_include.h>
#include <linux/module.h>
#include <linux/kernel.h>

#include <end_include.h>


Listing Three
class OSException 
{ 
public: 
     char* getMessage();      
     OSException(char* msg,int sev);
     int getSeverity(); 
     virtual void report(); 
     enum tSeverity {MINOR=1,MAJOR,FATAL}; 
private: 
     char* message; 
     int severity; 
}; 
class NetworkException : public OSException 
{ 
   ...
}; 
class ProntoException : public NetworkException 
{
   ...
}; 


Listing Four

asmlinkage int 
sys_pproc_type_call(int pptype, int call, void* args) 
{ 
  int retval = -ENOSYS;   /* returned unchanged if the call fails */
  try { 
    if (thePProcKType) {
      /* virtual dispatch into the dynamically loaded module */
      retval = 
         thePProcKType->syscall(pptype, call, args); 
    } else { 
      printk(KERN_ERR "pproc not loaded\n");   
    } 
  } catch(ProntoException & exception) { 
    exception.report(); 
  } catch(...) { 
    printk(KERN_ERR "Unknown Exception occurred\n"); 
  } 
  return retval; 
} 


Listing Five
int pronto_ip_rcv(struct sk_buff *skb, ...) 
{ 
  ...  
  flow = classifier->lookup(skb);   /* map the packet to its flow */
  if( flow ){
    try { 
      flow->arrive( skb );          /* run the chain of packet processors */
    } catch(ProntoException & exception) { 
      exception.report(); 
      kfree_skb(skb); 
    } catch(...) { 
      printk(KERN_ERR "Unknown exception occurred\n"); 
      kfree_skb(skb); 
    } 
  }
  ...
} 


