A truly portable C runtime available right now.
September 01, 2003
URL:http://www.drdobbs.com/writing-portable-applications-with-apr/184401691
Anybody who has ever written a C or C++ program that must work on many operating systems has faced the same problem: not all C/C++ libraries are portable. Every platform implements POSIX functions slightly differently, and if your program needs to work on platforms other than Unix, such as Windows, the problem is made worse by the fact that those platforms have their own APIs. Although those platforms have a POSIX implementation, the native APIs are usually faster and have fewer bugs than the POSIX APIs.
Programmers have developed many solutions to the portability problem through the years. The first solution is to program using strictly POSIX functions because most platforms have a POSIX layer. This solution works, but POSIX causes its own problems. For example, on Unix platforms, writing in append mode ensures that lines are always written at the end of the file. The POSIX layer on Windows, however, has no way to do this. So, to emulate append mode in Windows, you must seek to the end of the file and then write. Of course, a seek followed by a write is two separate operations, so the pair is not atomic and would need a lock to be safe. But Windows has a solution to this problem if you use native functions. It is possible to open a file for overlapped I/O, which allows you to seek and write in a single atomic operation.
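To make the append-mode contract concrete, here is a minimal sketch in standard C. The function name append_line is hypothetical; it is not part of APR or of this article's listings. The sketch relies only on the C standard's rule that writes to a stream opened in append mode go to the current end of the file, even after an explicit seek. It does not show how any particular C runtime implements that rule, and it makes no claim about several processes writing at once, which is where the platform differences come in.

```c
#include <stdio.h>

/* Hypothetical helper: append one line to `path` using standard C
 * append mode ("a"). Returns 0 on success, -1 on failure. */
int append_line(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    int ok;

    if (fp == NULL)
        return -1;

    /* Deliberately seek to the start of the file. In append mode the
     * C standard still forces the next write to the end of the file. */
    fseek(fp, 0L, SEEK_SET);

    ok = (fputs(line, fp) != EOF) && (fputc('\n', fp) != EOF);
    if (fclose(fp) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

Calling append_line twice on the same path leaves both lines in the file, in call order, despite the seek in between.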
The second option is to pick a core platform and find an emulation library for all other platforms. Many programmers use Unix as their core platform and Cygwin as their emulation layer. Although this solution works, applications that run within this kind of environment are running in an emulation mode, which means they don't behave like native applications to the user, and they are usually slower than native applications.
The third solution is for the programmer to write a portability library that abstracts the differences between platforms. The problem with this solution is that writing portable code isn't easy, and it takes a long time and a lot of testing to get a robust library. Unless your goal is to create a portable runtime, you are spending time duplicating work that has already been done many times.
The final option is to find a portable and robust library that has already been written. The rest of this article discusses the Apache Portable Runtime (APR), an open source portability runtime that aims to solve the C/C++ portability problem. This project is covered by the Apache License, which is a BSD-like license.
The Apache Software Foundation developed APR as part of creating the second version of the Apache web server. We wanted to port Apache 2.0 to as many versions of Unix as possible, as well as Windows, OS/2, BeOS, Netware, OS/390, AS/400, and other platforms. However, we also wanted to solve all of the problems associated with Apache 1.3. To that end, APR uses native functions on all platforms and relies on POSIX only when POSIX is the best option. Also, because all of the platform differences are isolated in the APR layer, the Apache code eliminates most of the #ifdefs that have caused such a maintenance problem. At its most basic level, APR is just an abstraction layer between the operating system and the application. When an #ifdef does occur in code that uses APR, it almost always refers to one of APR's feature macros and doesn't tie the code inside the #ifdef to a particular platform.
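The difference between a platform test and a feature test is easier to see in code. The sketch below is illustrative only: APR_HAS_THREADS is one of APR's feature macros and is normally defined to 0 or 1 by apr.h, but the fallback definition and the function name concurrency_model are assumptions added so that the fragment compiles without APR installed.

```c
/* In a real APR build, apr.h defines APR_HAS_THREADS as 0 or 1.
 * This fallback is an assumption so the sketch compiles on its own. */
#ifndef APR_HAS_THREADS
#define APR_HAS_THREADS 0
#endif

/* Hypothetical function: choose a strategy based on a feature,
 * not on the name of the operating system. */
const char *concurrency_model(void)
{
#if APR_HAS_THREADS
    return "threads";
#else
    return "processes";
#endif
}
```

Nothing in the function mentions Windows or Unix. If a platform gains thread support later, only the value of the macro changes; the application code stays the same.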
APR is a runtime library analogous to the C runtime or the Microsoft runtime. To use APR, you must adapt your code to call the equivalent APR functions. For example, if you are trying to open a file, instead of calling fopen or CreateFile, you would call apr_file_open. Of course, introducing APR functions to your code means that you are tying your application to APR, but the payoff is an application that works on more platforms. APR has functions for all of the most common operations. For a small sample of the features available in APR, see Table 1.
Although APR was created to support the Apache web server, the Apache Software Foundation makes APR available to all programmers as a general-purpose portability tool. To use APR in your programs, you will have to link against the APR library for your system. Currently, APR developers are distributing only source code, so you will have to build APR before you can use it in your applications. APR uses a standard autoconf build system on Unix, so to build APR, run the commands:
./configure; make; make install
For builds on Windows, APR has a project file that you can use in Visual Studio. Once the library is built, you need to link it into your program. This is done with the -l flag to the linker on Unix. On Windows, you will have to add APR to your project.
The rest of this article examines two standard Unix utilities rewritten with APR to demonstrate how APR can solve the portability problem. These programs are not easily available on Windows as native programs. They are available as part of most Unix portability packages, such as Cygwin, but since they are not native programs, the Windows versions do not behave like standard Windows programs. The APR-based examples provide a more native implementation of the utilities. The example programs described in this article are intended for illustration purposes and do not implement all of the options found in their Unix equivalents.
After these initial setup steps, the program creates its first memory pool. If APR has a major drawback, this is it. APR was designed around a very specific memory model: pools. The idea is that memory is allocated early and reused as often as possible. This design can be a major advantage for programs that do the same operations repeatedly, because the memory usage hits a steady state and the same memory can be used repeatedly. However, pools may not work well for programs that perform many different tasks, such as games, which are constantly changing their memory usage. For a more complete description of memory pools, see the sidebar.
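The steady-state behavior described above can be sketched with a toy allocator. This is not APR's pool implementation and none of these names exist in APR: toy_pool and its functions are hypothetical, and a real pool can grow by acquiring more blocks, while this one has a single fixed block. The point is only that clearing a pool makes the same memory available again without returning it to the system.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical toy pool: one fixed block handed out piece by piece. */
typedef struct {
    unsigned char *base;  /* start of the block */
    size_t size;          /* total bytes in the block */
    size_t used;          /* bytes handed out so far */
} toy_pool;

/* Acquire the block once, up front. Returns 0 on success, -1 on failure. */
int toy_pool_init(toy_pool *p, size_t size)
{
    p->base = malloc(size);
    p->size = size;
    p->used = 0;
    return p->base != NULL ? 0 : -1;
}

/* Hand out the next n bytes, or NULL if the block is exhausted. */
void *toy_pool_alloc(toy_pool *p, size_t n)
{
    /* Round up so every allocation stays suitably aligned (C11). */
    const size_t align = _Alignof(max_align_t);
    size_t rounded = (n + align - 1) / align * align;
    void *mem;

    if (rounded < n || rounded > p->size - p->used)
        return NULL;
    mem = p->base + p->used;
    p->used += rounded;
    return mem;
}

/* Forget every allocation at once; the block itself is kept for reuse. */
void toy_pool_clear(toy_pool *p)
{
    p->used = 0;
}

/* Give the block back to the system. */
void toy_pool_destroy(toy_pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = 0;
    p->used = 0;
}
```

A program that allocates from the pool while handling one unit of work, then calls toy_pool_clear before the next, gets the same addresses back each time. There is no per-object free, which is both the convenience and the constraint of the model.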
The next two lines open the two file descriptors that I need for this application. The first is standard output. Most programmers are used to using the C library's stdout for this purpose, but stdout isn't always available on Windows. For example, it is standard practice in a Unix daemon to redirect stderr to a log file for easy debugging. Windows Services, however, do not have stdin, stdout, or stderr. By providing functions to access the equivalent of those file handles, APR removes a major portability hurdle. The second file descriptor refers to the file that I want to read. Unfortunately, file permissions are not handled well in APR and are mostly meaningless on non-Unix platforms. This is a hard problem, and hopefully the APR developers will tackle it in a later release. Also notice that I check for success using APR_SUCCESS. This check is standard in APR; almost all APR functions return APR_SUCCESS if the function finished successfully and the exact error code if it did not. Functions that do not follow this convention generally cannot fail and so do not return any value.
Finally, the program loops through the file, reading one line at a time and writing it to stdout. Notice that I did not close either file descriptor. The pool model lets APR applications drop file descriptors when they are no longer needed. APR applications can register cleanups to run when a pool is cleared or destroyed. When APR opens a file, socket, or any other resource, it registers a cleanup to run when the pool is cleared. For files, the cleanup closes the file descriptor. As long as pools are used judiciously, this ensures that resource leaks are rare because resources are cleaned as a part of memory management.
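The cleanup mechanism can be sketched in a few lines of standard C. This is a hypothetical illustration, not APR's code: toy_scope, toy_scope_register, toy_scope_clear, and toy_scope_fopen are names invented for this example. The sketch keeps a list of callbacks and runs them, most recently registered first, when the scope is cleared, which is the same general idea as tying a file's lifetime to a pool.

```c
#include <stdio.h>
#include <stdlib.h>

typedef void (*cleanup_fn)(void *data);

/* One registered cleanup: a callback plus the resource it releases. */
typedef struct cleanup {
    cleanup_fn fn;
    void *data;
    struct cleanup *next;
} cleanup;

/* Hypothetical stand-in for a pool; it tracks cleanups only. */
typedef struct {
    cleanup *cleanups;  /* most recently registered first */
} toy_scope;

/* Remember that fn(data) must run when the scope is cleared. */
int toy_scope_register(toy_scope *s, void *data, cleanup_fn fn)
{
    cleanup *c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = s->cleanups;
    s->cleanups = c;
    return 0;
}

/* Run every registered cleanup once and empty the list. */
void toy_scope_clear(toy_scope *s)
{
    while (s->cleanups != NULL) {
        cleanup *c = s->cleanups;
        s->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
}

static void close_file(void *data)
{
    (void)fclose((FILE *)data);
}

/* Open a file whose lifetime is tied to the scope: the caller never
 * calls fclose; clearing the scope does it. */
FILE *toy_scope_fopen(toy_scope *s, const char *path, const char *mode)
{
    FILE *fp = fopen(path, mode);
    if (fp != NULL && toy_scope_register(s, fp, close_file) != 0) {
        fclose(fp);
        fp = NULL;
    }
    return fp;
}
```

With this arrangement, a function can open several files, return early on an error, and still leak nothing, as long as whoever owns the scope eventually clears it.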
To use apr_getopt, you must first initialize it with apr_getopt_init. Then you can loop, calling apr_getopt each time through the loop. As long as apr_getopt returns APR_SUCCESS, you know that the next option on the command line is one the program accepts. In this case, I have defined f and i to be the only options I will accept, and neither takes an argument. After apr_getopt returns, you can act on the option. In this case, I keep track of the number of options so that I can find the list of files to delete later on. Then, depending on whether the user asked to force the delete (f) or to be prompted interactively (i), I set a boolean. If an unrecognized option is given, I call a simple function that reminds the user of the possible options and exits.
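The control flow of that option loop can be sketched without APR. The function below is a hand-rolled stand-in written for this illustration, not apr_getopt and not the article's listing; the name parse_rm_flags and its return convention are assumptions. It accepts only f and i, records each as a boolean, and reports where the file names begin, which plays the role of counting the options.

```c
#include <string.h>

/* Hypothetical parser for leading -f / -i flags (clusters such as
 * "-fi" are accepted). Returns the index of the first file name, or
 * -1 if an unrecognized option letter is seen. */
int parse_rm_flags(int argc, const char *const *argv,
                   int *force, int *interactive)
{
    int i;

    *force = 0;
    *interactive = 0;

    for (i = 1; i < argc; i++) {
        const char *arg = argv[i];
        const char *c;

        if (arg[0] != '-' || arg[1] == '\0')
            break;              /* first file name, or a lone "-" */
        if (strcmp(arg, "--") == 0) {
            i++;                /* explicit end of options */
            break;
        }
        for (c = arg + 1; *c != '\0'; c++) {
            switch (*c) {
            case 'f':
                *force = 1;
                break;
            case 'i':
                *interactive = 1;
                break;
            default:
                return -1;      /* caller prints usage and exits */
            }
        }
    }
    return i;
}
```

A caller would check for -1, print the usage reminder, and exit; otherwise it walks argv from the returned index to argc, treating each entry as a file to delete.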
One quick warning about apr_getopt. Like most getopt implementations, it automatically prints an error if an illegal flag is passed to the program. For example, if -G is passed to rm, the following error message appears:
a.out: illegal option -- G
./a.out [-fi] file_name
Notice that I did not put the illegal option error message in the code.
This message can be suppressed by adding the line
opt->errfn = NULL;
after the call to apr_getopt_init. This line tells APR to leave it up to the programmer to print all error messages from apr_getopt. Most people will not want to do this, because printing the error message is standard for most getopt implementations.
Now we get to the meat of the program: a simple loop that tries to delete every file in the argument list. This simple example does not allow the user to delete directories. To support directory deletion, you would need to recursively delete every file in the directory, which you could easily add using apr_dir_read, but that step is left as an exercise for the reader. To keep people from deleting directories, though, I must know when somebody tries to do so. This can be done using apr_stat.
If you are used to the standard stat function, apr_stat will look a little strange. The strangest part is the third argument, which in this case is APR_FINFO_TYPE. The APR developers had two major goals when writing APR: portability and performance. Often those goals are in conflict, and apr_stat is a good example. The problem is that stat is a very expensive call, and some platforms (most notably Windows) can return some information very easily, while other information takes more time. So, to balance portability and performance, the third argument was added to apr_stat. It is a bitwise OR of flags naming the information that you want returned. The contract from apr_stat is that it can return more information than you asked for, but it can never return less (unless there is an error). Since all I care about is the type of the file, that is all I have asked for.
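The "at least what you asked for" contract can be illustrated with a small analogy in standard C. This is not apr_stat: toy_stat, toy_info, WANT_SIZE, and WANT_LINES are hypothetical names, and the two fields were chosen only because one is cheap to compute and the other is not. The sketch always reports the cheap field, computes the expensive one only on request, and records which fields are trustworthy in a bitmask.

```c
#include <stdio.h>

/* Hypothetical request bits, in the spirit of an OR'ed "wanted" mask. */
enum {
    WANT_SIZE  = 1 << 0,  /* cheap here: one seek to the end */
    WANT_LINES = 1 << 1   /* expensive here: reads the whole file */
};

typedef struct {
    unsigned valid;  /* which of the fields below were filled in */
    long size;       /* file size in bytes */
    long lines;      /* number of '\n' characters */
} toy_info;

/* Fill in at least the fields named in `wanted`.
 * Returns 0 on success, -1 if any requested field is unavailable. */
int toy_stat(toy_info *info, const char *path, unsigned wanted)
{
    FILE *fp;

    info->valid = 0;
    fp = fopen(path, "rb");
    if (fp == NULL)
        return -1;

    /* Size is cheap, so report it whether or not it was requested.
     * (Seeking to the end of a binary stream works on common
     * platforms, though the C standard does not strictly promise it.) */
    if (fseek(fp, 0L, SEEK_END) == 0) {
        long end = ftell(fp);
        if (end >= 0) {
            info->size = end;
            info->valid |= WANT_SIZE;
        }
    }

    /* Counting lines costs a full read; do it only on request. */
    if (wanted & WANT_LINES) {
        long lines = 0;
        int ch;

        rewind(fp);
        while ((ch = getc(fp)) != EOF) {
            if (ch == '\n')
                lines++;
        }
        if (!ferror(fp)) {
            info->lines = lines;
            info->valid |= WANT_LINES;
        }
    }

    fclose(fp);
    return (info->valid & wanted) == wanted ? 0 : -1;
}
```

A caller that passes only WANT_LINES still gets a valid size back, which mirrors "more than you asked for". A caller that passes only WANT_SIZE must not read lines, because that bit is not set in valid.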
If the user does ask to delete a directory, the program prints a simple error message and continues to the next file on the command line. Notice, however, how the error message is terminated: it ends with APR_EOL_STR rather than a hard-coded newline, which is standard practice for APR applications. The APR_EOL_STR macro is defined by APR to be the correct end-of-line sequence for a given platform, so on Unix it maps to \n, but on Windows it is \r\n.
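A platform-selected end-of-line string is simple to sketch. The macro TOY_EOL_STR and the function format_dir_error below are hypothetical stand-ins written for this illustration; APR supplies APR_EOL_STR itself, and the wording of the message here is invented rather than taken from the listing. String literal concatenation glues the terminator onto the format string at compile time.

```c
#include <stdio.h>

/* Hypothetical stand-in for a platform end-of-line macro. */
#if defined(_WIN32)
#define TOY_EOL_STR "\r\n"
#else
#define TOY_EOL_STR "\n"
#endif

/* Format an error line for `name` into buf. Returns what snprintf
 * returns: the length the full message needs, or a negative value. */
int format_dir_error(char *buf, size_t len, const char *name)
{
    return snprintf(buf, len, "%s: is a directory" TOY_EOL_STR, name);
}
```

An explicit terminator like this matters most when output goes through native file handles rather than through a C runtime text-mode stream, which on Windows would translate \n on its own.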
After checking for directories, the program checks if the user should be prompted before deleting the file and, if so, prompts accordingly. Assuming the user has elected to continue, the program can finally delete the file.
You'll find more information on APR, including full source code, at <http://apr.apache.org>. The website includes full documentation for all APR APIs, as well as information on how to contribute to the APR project.