GNU's C Language Extensions

GNU's GCC compiler has a number of interesting and useful ISO C99 and non-ISO extensions (among others) to C that are commonly overlooked. These features can help simplify the development of C applications and make them easier to debug.


May 01, 2005
URL:http://www.drdobbs.com/gnus-c-language-extensions/184401956

May, 2005: GNU's C Language Extensions

M. Tim Jones is a software engineer and the author of GNU/Linux Application Programming (2005) and BSD Sockets Programming from a Multilanguage Perspective (2003, both Charles River Media). Tim is currently a senior principal engineer at Emulex and can be contacted at [email protected].


In addition to being the standard GNU/Linux compiler, and the de facto standard embedded compiler, GNU's GCC has a number of interesting and useful features that are commonly overlooked. These features can help simplify the development of C applications and make them easier to debug.

In this article, I examine some ISO C99 and non-ISO extensions to the C language provided through GCC and demonstrate their use. I focus on the most recent GCC 3.4 compiler, since it has introduced many new useful features. GCC 3.4 ships as the standard C compiler in the Fedora Core 3 Linux distribution.

Language Features

GCC provides a number of useful additions to the C language. While I look at five examples here, numerous others are available.

Type Referencing with typeof. The typeof operator lets you refer to a variable's type through the variable itself. It's similar to the sizeof operator that returns the size in bytes of a given variable. Consider this declaration fragment:

int i;
typeof(i) j;

Here I create an integer variable named i, then use typeof to create a second variable j with the same type as i. Not a very interesting example, but consider the macro in Listing 1, which declares its temporary swap variable using typeof on one of the macro arguments. This macro can be used as a general-purpose swap routine that operates on any arithmetic type (char, int, float, unsigned int, and so on), provided both arguments have the same type. Otherwise, a separate function would be required for each of the fundamental types.
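The same idea extends beyond swapping. As a minimal sketch (the macro and function names here are hypothetical, not taken from the listings), a maximum macro can use typeof so that each argument is evaluated exactly once, whatever its type. The block uses the alternate spelling __typeof__, which GCC also accepts when compiling in strict ISO modes such as -std=c99.

```c
/* MAX_OF is a hypothetical name. The ({ ... }) statement expression and
 * __typeof__ are GNU extensions; the temporaries ensure that each
 * argument is evaluated only once. */
#define MAX_OF(a, b)                        \
    ({ __typeof__(a) max_a_ = (a);          \
       __typeof__(b) max_b_ = (b);          \
       max_a_ > max_b_ ? max_a_ : max_b_; })

/* Small wrappers showing the macro applied to two different types. */
int larger_int(int a, int b)
{
    return MAX_OF(a, b);
}

double larger_double(double a, double b)
{
    return MAX_OF(a, b);
}
```

Because the arguments are copied into temporaries first, an argument with a side effect, such as i++, is applied only once.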

Case Ranges. GCC permits the specification of consecutive ranges within the case labels of a switch statement. Listing 2 illustrates this. Note that the spaces around the "..." matter: With integer values, a range written without them (1...5) can be parsed incorrectly, so write 1 ... 5. This technique can help simplify if-then-else chains with multiple ranges.
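As a small sketch of how case ranges can replace an if-then-else chain, the following function classifies a character. The enum and function names are hypothetical and chosen only for this illustration.

```c
/* Hypothetical character classes for illustration. */
enum char_class { CC_LOWER, CC_UPPER, CC_DIGIT, CC_OTHER };

/* Classify a character using GNU case ranges
 * (note the spaces around the "..."). */
enum char_class classify(char ch)
{
    switch (ch) {
    case 'a' ... 'z':
        return CC_LOWER;
    case 'A' ... 'Z':
        return CC_UPPER;
    case '0' ... '9':
        return CC_DIGIT;
    default:
        return CC_OTHER;
    }
}
```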

Designated Initializers. C89 requires array initializers to be listed in element order, starting from the first element. ISO C99 and the GNU extensions let you designate which elements to initialize, in any order; elements that are not named are set to zero. For example, these two initializations are identical:

int array1[8]={0, 0, 0, 3, 0, 5, 0, 0};


int array1[8]={[3]=3, [5]=5};

Initialization can also be done with ranges, a GNU extension that uses the same syntax as case ranges. These examples are identical:

int array1[8]={0, 0, 1, 1, 1, 2, 2, 2};

int array1[8]={[2 ... 4]=1, [5 ... 7]=2};

As with case ranges, the spaces around the "..." are necessary.
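Range designators are handy for building lookup tables. The sketch below, with hypothetical names, marks the hexadecimal digit characters in a 256-entry table and leaves every other entry at zero.

```c
/* Lookup table built with GNU range designators.
 * Entries that are not named are initialized to zero. */
static const unsigned char hex_digit_table[256] = {
    ['0' ... '9'] = 1,
    ['a' ... 'f'] = 1,
    ['A' ... 'F'] = 1
};

/* Returns 1 if c is a hexadecimal digit character, 0 otherwise. */
int is_hex_digit(unsigned char c)
{
    return hex_digit_table[c];
}
```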

Variable-Length Arrays. GCC permits the declaration of arrays using nonconstant expressions. This is possible in ISO C99, but not in C89. Typical array declarations have the form:

int array[ 10 ];

but nonconstant lengths can also be specified, as:

int array[ func() ];

where the size of the array is the return value of func. It's also possible to declare nonconstant size arrays as arguments to functions. For example:

void check( int len, int array[len] )

declares a variable-length array parameter whose length is taken from the first parameter in the argument list.
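Both forms can be combined. In this sketch (function names are hypothetical), one function takes a variable-length array parameter and another declares a local array whose size is known only at runtime.

```c
/* The array parameter's length comes from the preceding len parameter. */
long sum_array(int len, int array[len])
{
    long total = 0;
    for (int i = 0; i < len; i++)
        total += array[i];
    return total;
}

/* A local variable-length array sized at runtime; n must be positive. */
long sum_of_squares(int n)
{
    int squares[n];
    for (int i = 0; i < n; i++)
        squares[i] = (i + 1) * (i + 1);
    return sum_array(n, squares);
}
```

Keep in mind that a local variable-length array lives on the stack, so very large sizes are better served by malloc.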

Zero-Length Arrays. Standard C requires all arrays to contain at least one element, but in GNU C, you can declare zero-length arrays. This can be very useful in applications where the size of the array needs to be dynamic. Consider Listing 3, in which the zero-length array becomes an array of len bytes when returned from getPayload.
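To show how the trailing array is then used, here is a minimal sketch along the same lines. The type and function names are hypothetical; the header and the data are allocated as one block and the data is copied in directly after the header.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical message type with a zero-length trailing array. */
typedef struct {
    int  len;
    char data[0];   /* occupies no space in the struct itself */
} message_t;

/* Allocate the header and len data bytes together, then copy src in.
 * Returns NULL if the allocation fails; the caller frees the result. */
message_t *make_message(const char *src, int len)
{
    message_t *msg = malloc(sizeof(message_t) + len);
    if (msg) {
        msg->len = len;
        memcpy(msg->data, src, len);
    }
    return msg;
}
```

A single free() releases both the header and the data, which is one reason this layout is popular for variable-size packets.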

Using Attributes

With attributes, you can instruct the compiler to treat functions or variables specially based upon the attribute used. Traditionally, attributes have been used to identify interrupt handlers or to place functions in named sections. But GNU provides some other useful function and variable attributes.

Inline control. Inlining functions is a common technique to help increase the performance of an application. Performance is increased by avoiding the call/return instructions and the additional frame management. GCC can do this automatically, given a function size threshold (and the appropriate optimization level enabled), but in some cases, you know exactly what to inline and what not to. The attributes noinline and always_inline can be used for this purpose.

Specifying each of the attributes is performed with the function prototype:

void smallFunction( void ) __attribute__ 
  ((always_inline));

void largeFunction( void ) __attribute__ 
  ((noinline));

The inline function modifier can also be used, but requires that optimization be enabled. These attributes work explicitly on the functions, whether optimization is enabled or not.
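A minimal sketch of the two attributes working together follows; the function names are hypothetical. The small helper is forced inline into its caller, while the caller itself is kept as a real function.

```c
/* Ask GCC to inline this small helper at every call site. */
static inline int square(int x) __attribute__((always_inline));

/* Ask GCC never to inline this one, for example so that it stays
 * visible as a separate function in a profile or a backtrace. */
int sum_squares_upto(int n) __attribute__((noinline));

static inline int square(int x)
{
    return x * x;
}

int sum_squares_upto(int n)
{
    int total = 0;
    for (int i = 1; i <= n; i++)
        total += square(i);
    return total;
}
```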

Warning of Unused Return Values. The compiler can be instructed to emit a warning whenever a function's return value is ignored using the function attribute warn_unused_result. This is specified as:

int getTemperature( int sensor) 
__attribute__ ((warn_unused_result));

The compiler subsequently generates a warning message during the compile stage for any caller that does not use the return value.
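The attribute is most useful on functions that report errors through their return value. In this sketch, parse_digit and digit_or_default are hypothetical names used only to show a caller that checks the result.

```c
/* Hypothetical parser: returns 0 on success, -1 if c is not a digit. */
int parse_digit(char c, int *out) __attribute__((warn_unused_result));

int parse_digit(char c, int *out)
{
    if (c < '0' || c > '9')
        return -1;
    *out = c - '0';
    return 0;
}

/* Returns the digit value, or fallback when c is not a digit. */
int digit_or_default(char c, int fallback)
{
    int value;

    /* Writing "parse_digit(c, &value);" on its own here, without
     * using the result, would draw the warning. */
    if (parse_digit(c, &value) != 0)
        return fallback;
    return value;
}
```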

Warning of NULL Function Parameters. Using a function attribute and a compiler option, you can instruct the compiler to warn you if a function is passed a NULL parameter. You use the nonnull function attribute and argument list to specify which parameters may not be NULL. In this example, NULL may not be passed for the first or second parameter:

int sendPacket( void *header, void *payload, int payload_len )
   __attribute__ ((nonnull (1, 2)));

For this check to be made, the warning option -Wnonnull must be enabled in the compiler.
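As an illustration, the following sketch applies nonnull to a hypothetical copy routine. The warning is issued at compile time when the compiler can determine that a null pointer is passed in a marked argument slot, so it complements, rather than replaces, any runtime checks you need.

```c
#include <string.h>

/* Hypothetical helper: neither dst nor src may be NULL. */
int copy_header(void *dst, const void *src, int len)
    __attribute__((nonnull(1, 2)));

/* Copies len bytes and returns len, or -1 for a negative length. */
int copy_header(void *dst, const void *src, int len)
{
    if (len < 0)
        return -1;
    memcpy(dst, src, (size_t)len);
    return len;
}

/* A call such as copy_header(NULL, buf, 4) is the kind of call
 * that -Wnonnull is meant to flag. */
```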

Mapping Functions to Sections. By default, all functions are mapped into a section called .text. It's sometimes necessary to create new sections into which functions can be mapped. One example in the embedded domain is the mapping of performance-path functions to cached memory, and nonperformance functions to uncached memory. The first step is identifying a section into which these functions will be placed. This is done with the section function attribute:

int routePacket( packet_t *packet )
   __attribute__ ((section ("fastpath")));

This places the function routePacket into a section called fastpath. The GNU linker can then be used to map this section to a specific memory region with the needed attributes.

Mapping Variables to Sections. You can also change the default section for variables, as demonstrated for functions. While the compiler will place a variable in either the .data section or the .bss (uninitialized data) section, there are some cases when you need to provide further mapping. For example, if the data is used in the performance path, you will want to map it to a cached region. For data that is used by a DMA engine, you'll need an uncached region. Mapping variables to sections is similar to function mapping:

taskList_t *taskList
   __attribute__ ((section ("cached"))) = (taskList_t *)0;

Note that in this example, the variable initialization follows the attribute specification. You then rely on the linker to place these sections in their appropriate memory regions using the linker script.
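The two attributes can be tried together on a desktop system before writing any linker script. This sketch assumes an ELF target such as GNU/Linux; the function, variable, and section names are arbitrary labels chosen for illustration. Functions and variables go into separate sections because code and data need different section flags.

```c
/* Place a hypothetical performance-path function in its own section. */
int route_cost(int hops) __attribute__((section("fastpath")));

int route_cost(int hops)
{
    return hops * 10;   /* placeholder computation */
}

/* Place an initialized variable in a separate data section. */
int task_count __attribute__((section("cached"))) = 3;
```

Without a custom linker script, the linker simply places these extra sections alongside the default ones. Running objdump -h on the resulting object file or executable should list fastpath and cached among the section headers.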

Function Hooks

GCC can insert hooks into an application for a variety of purposes. I look at three such uses here.

Instrument Functions. An interesting GCC extension is the selective instrumenting of functions to identify their entry and exit points at runtime. This can provide an address call trace of a running application and, with some additional tooling, function name and line number information.

First, GNU provides hooks to capture whenever an instrumented function is called or exits with these prototypes:

void __cyg_profile_func_enter 
   ( void *func_address, void *call_site );

void __cyg_profile_func_exit  
   ( void *func_address, void *call_site );

In each case, the func_address argument is the address of the function that is being entered or is exiting. This address can be found in the map file for the given executable. The call_site variable is the address from which the function was called (for _enter) or the address from which the function returns (in the _exit case). For purposes here, you can use the func_address for a simple call trace.

To enable function instrumenting, you compile your source files with the -finstrument-functions option (along with -g to ensure that debugging data is present):

gcc -g -o test test.c -finstrument-functions

You can define which files have instrumentation and which do not by providing or omitting the -finstrument-functions option. You can also selectively disable instrumentation for a function in a file for which -finstrument-functions has been specified; see Listing 4.

The first thing to note in Listing 4 is the attribute specification of the profiling function. In this case, you instruct the compiler not to instrument this function (which would result in recursive profiling). The profiling function (only for the entry case) simply emits the address of the function to stdout. Next, you create a few functions to illustrate a sample trace.

If you now run this application, the result is a stream of addresses printed to stdout, which isn't entirely useful. You can increase its value by using the addr2line utility, a useful utility that takes an image and an address and converts it to a function name and source-line number. To take the stream of addresses and use it with the addr2line utility, use xargs to direct the output of the application as the command-line arguments to addr2line; see Listing 5. In Listing 5, you now see the output generated by addr2line. Each address that was emitted by the instrumented application is translated into a function name and the source line in the file for the function. This represents the call trace for a sample application that was enabled with a simple function and a single flag. With just a little more work, a call tree could also be generated.
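As a rough sketch of the extra work a call tree needs, the hooks below track nesting depth and indent each line accordingly. The call_depth variable and the output format are my own assumptions, not part of GCC; the trace goes to stderr, and the hooks are only invoked automatically when the rest of the program is compiled with -finstrument-functions.

```c
#include <stdio.h>

/* Current nesting depth of instrumented calls (hypothetical helper). */
static int call_depth = 0;

void __cyg_profile_func_enter(void *func, void *call_site)
    __attribute__((no_instrument_function));
void __cyg_profile_func_exit(void *func, void *call_site)
    __attribute__((no_instrument_function));

/* On entry: print the address indented by the current depth, then descend. */
void __cyg_profile_func_enter(void *func, void *call_site)
{
    (void)call_site;
    fprintf(stderr, "%*senter %p\n", call_depth * 2, "", func);
    call_depth++;
}

/* On exit: climb back up, then print at the matching indentation. */
void __cyg_profile_func_exit(void *func, void *call_site)
{
    (void)call_site;
    call_depth--;
    fprintf(stderr, "%*sexit  %p\n", call_depth * 2, "", func);
}
```

This version is not thread safe, since call_depth is a single shared counter; a multithreaded program would need one counter per thread.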

Wrapping Functions. Using GNU's builtin functions, you can wrap an existing function call with your own function, while preserving the argument list and return value. Three builtin functions provide this capability:

void *__builtin_apply_args();

void *__builtin_apply( void (*func)(), 
    void *args, size_t size );

void __builtin_return( void *result );

The __builtin_apply_args function returns a void pointer to a block of data describing the arguments that were passed into the current function. The __builtin_apply function takes the arguments saved in args (from __builtin_apply_args) and passes them to function func. The size parameter specifies the size, in bytes, of the argument data that is pushed on the stack. Finally, __builtin_return returns the value from __builtin_apply to the original caller.

Listing 6 is an example of this in which the printf call is wrapped by a new function called newprintf. Function newprintf accepts a format string and a variable argument list (indicated by the ellipsis "..."). Using the builtin functions, you capture the argument list, apply it to the function that you're wrapping (printf), and then pass the return value of the wrapped function back to the original caller.

Main Function Constructor/Destructor. You can provide constructor- and destructor-like functions for main functions using C extensions. These are provided by two special function attributes called constructor and destructor. By applying the constructor attribute to a function, the function is called before the main function of the C program. Conversely, with the destructor function attribute, upon exit of the C application, the destructor function is called. These functions can be created as:

void myConstructor( void ) __attribute__ ((constructor));

void myDestructor( void ) __attribute__ ((destructor));
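A typical use is one-time setup that must be in place before main runs. In this minimal sketch, the flag and function names are hypothetical.

```c
/* Hypothetical state that must be ready before main() runs. */
static int subsystem_ready = 0;

void start_subsystem(void) __attribute__((constructor));
void stop_subsystem(void) __attribute__((destructor));

/* Called automatically before main(). */
void start_subsystem(void)
{
    subsystem_ready = 1;
}

/* Called automatically after main() returns or exit() is called. */
void stop_subsystem(void)
{
    subsystem_ready = 0;
}
```

By the time the first statement of main executes, subsystem_ready is already 1.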

Conclusion

The extensions I discuss here can be useful in developing C applications on GNU systems. In fact, you can find their use sprinkled throughout the Linux kernel. Care should be taken when portability is important because they may not be available in other compilers. To help check an application's portability, the compiler flag -pedantic can be specified to warn if any nonstandard features are used.
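One common way to limit the portability risk for attributes is to hide them behind a macro that expands to nothing on compilers that do not define __GNUC__. This is a sketch of that convention, with a hypothetical macro and function name.

```c
/* WARN_UNUSED_RESULT is a hypothetical wrapper macro. */
#if defined(__GNUC__)
#define WARN_UNUSED_RESULT __attribute__((warn_unused_result))
#else
#define WARN_UNUSED_RESULT   /* expands to nothing elsewhere */
#endif

int read_sensor(int sensor) WARN_UNUSED_RESULT;

int read_sensor(int sensor)
{
    return sensor * 2;   /* placeholder value for illustration */
}
```

The same pattern does not help with syntax extensions such as case ranges or statement expressions, which have no standard equivalent to fall back on and must be rewritten for other compilers.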

Resources

"Using the GNU Compiler Collection (GCC)," Free Software Foundation (http://gcc.gnu.org/onlinedocs/gcc-3.4.3/gcc/).

Listing 1

/* Swap two lvalues of the same type; typeof(x) types the temporary. */
#define swap( x, y )                 \
        ({ typeof(x) temp = (x);     \
           (x) = (y); (y) = temp;    \
        })

Listing 2

char ch;
 ...
switch( ch ) {
  case 'a' ... 'z':
    printf("lowercase\n"); break;
  case '0' ... '9':
    printf("number\n"); break;
}

Listing 3

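#include <stdlib.h>   /* malloc */

typedef struct {
  int len;
  char data[0];       /* zero-length array: adds no size to the struct */
} payload_t;

/* Allocate the header plus len bytes of payload as a single block. */
payload_t *getPayload( int len )
{
  payload_t *payload;

  payload = malloc( sizeof(payload_t) + len );
  if (payload) payload->len = len;

  return payload;
}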

Listing 4

#include <stdio.h>

void __cyg_profile_func_enter( void *, void * )
       __attribute__ ((no_instrument_function));

void __cyg_profile_func_enter(void *this, void *callsite)
{
  printf("%p\n", this);   /* %p expects a void pointer */
}


void func_c( void )
{
  return;
}

void func_b( void )
{
  func_c();

  return;
}

void func_a( void )
{
  func_b();

  return;
}


int main()
{
  func_a();
  func_c();
}

Listing 5

$ ./instrument | xargs addr2line -e instrument -f
main
/home/mtj/gnu-ext/instrument.c:33
func_a
/home/mtj/gnu-ext/instrument.c:25
func_b
/home/mtj/gnu-ext/instrument.c:18
func_c
/home/mtj/gnu-ext/instrument.c:13
func_c
/home/mtj/gnu-ext/instrument.c:13
$

Listing 6

#include <stdio.h>

int newprintf( const char *fmt, ... )
{
  void *args, *ret;

  /* Capture the arguments passed to this function. */
  args = __builtin_apply_args();
  /* Call printf with them; 1024 bounds the stack argument bytes. */
  ret = __builtin_apply( (void (*)())printf, args, 1024 );

  /* Hand printf's return value back to our caller. */
  __builtin_return( ret );
}

int main()
{
  newprintf( "A %s of the new %s function.\n",
              "test", "printf" );
  return 0;
}
