Dr. Dobb's | About Log Files: Part 1

About Log Files: Part 1

Just because log files seem boring doesn't mean they're not important

January 27, 2009
URL:http://www.drdobbs.com/cpp/about-log-files-part-1/212902973

Stefan is a software developer in Berlin specializing in cross-platform development, user interfaces, and reengineering. He can be contacted at [email protected]

Every program should write a log file. Period. Log files will save you hours and sometimes days of tedious debugging work. They are a basic part of structured debugging and quality assurance. The idea might seem odd in these days of application servers, SOA, and the like, but log files are the most efficient and flexible way to debug software and analyze problems. In the first installment of this two-part article, I examine the basics of logging and log files. In Part 2, I'll examine specialized techniques that you can use for multithreaded application or large data sets.

Why Log Files?

In a perfect world, we wouldn't need customer support, debuggers, or log files. In the real world, the more complex things get, the more difficult problems arise. Some will be seen and fixed before a program is in production/released. Others will show up much later and raise thier heads rarely. That's where log files come in.

A log file has several benefits for analyzing a software problem:

It's simple to create and efficient to write
It doesn't lie or forget. It's robust.
It can be transmitted easily by email. It's flexible

When problems arise in software, tech suppoort will usually receive a call or a email about it. It is amazing, how bad, how little, and how confused the information from customers will be about what they have done, what happened, and what the problem is. If your application writes a log file, even if it only contains the used data sources and the executed commands, you will certainly notice that this information often differs from the information relayed by the customer. This will save time and provide clarity--In the same way that a flight recorder often provides the only valid information source to analyze aircraft crashes.

Tech support people can also learn to read and interpret log files and thereby reduce the number of times they need to contact developers. But most importantly: Log files help you in finding bugs that you can't reproduce. Debugging literature general states " make it fail" as a major step (see Chapter 4: Make it Fail in Debugging, by David J. Aagans. Ifyou can't make it fail, then log files are your last resort.

When and Where Do You Store Log Files?

When creating log files you should consider several conditions:

An application should open the log file at startup and close it just before termination.
It's important to make sure that a program has write access to the log file's location on all supported platforms. Windows Vista, for instance, is much more restrictive about that than previous versions. The log file should be easy to find and have a name that is self explanatory, such as "AppName".log. The best location is always a folder named "AppName"Logs\ wherever its location is appropriate.
You might come up with the idea of creating separate log files for separate modules or for different purposes ( a standard log and debugging log, for example). Generally speaking, this is not a good idea. You'll find yourself searching in multiple log files, comparing timestamps, and so on. It's best to use only one log file at a time. That said, separate instances of an application should use separate files as long as they operate independently.
You might also come up with the idea to log not to a file but to a database. This seems high-tech, but it directly contradicts the key concepts of simplicity, flexibility, and robustness.

What to Log

The first entry in a log file should be something like "starting log" and the last entry should be a statement like "closing log file." If this entry is missing, you know that the application terminated unexpectedly.

After this first entry, some general information should be recorded:

Operating system version, patches.
Size of RAM installed, size of free RAM available, and free space on hard drives.
IP address, machine name, username, and so on.
Current date and time
The version of your program.
The version of components, DLLs, and libraries that you use.

Logging Levels

After having written general information, it's a good idea to be able to control the amount of data written to a log file. This should be done by a "logging level" parameter in an application, that can be modified at runtime. If your program is an application with a GUI, it should contain such a list box, where users might be directed to tech support. Each log statement must belong to one logging class. The statement is only processed if the currently selected logging level is higher than or equal to the logging class of the statement.

Logging classes can include:

Logging Class 0: Basic
- Data sources/documents/connections opened and closed
- Size or number of items in opened documents
- Commands executed
- User or requester information
- All messages shown to users
- All answers/choices users have made
Logging Class 1: Extended
- Information from certain important functions called, such as executed or passed program paths.
- Information about received and processed events/requests
- All status information shown to users
Logging Class 2: Debug
- Information from within loops for all iterations
- Extensive data dumps
- Additional debug information

enum LogClasses
{
    LOG_CLASS_BASIC     = 0,
    LOG_CLASS_EXTENDED  = 1,
    LOG_CLASS_DEBUG     = 2
};
int   gCurrentLogLevel = LOG_CLASS_EXTENDED;
FILE *gLog file = NULL;
//////////////////////////////////////////////////////

void  MyLogFunc(int LogClass, char *str, ...)
{   char tx[512], tx2[512];
    va_list marker;
    if(LogClass > gCurrentLogLevel)
        return;
    fprintf(gLog file, "%s %s ", _strdate(tx), _strtime(tx2));

    if(!IsMainThread())
    {
        fprintf(gLog file, "THREAD %4d: ", GetThreadID());
    }
    va_start(marker, str);
    vfprintf(gLog file, str, marker);
    va_end(marker);
}
///////////////////////////////////////////////////////
void testFunc()
{
    MyLogFunc(LOG_CLASS_BASIC, "This text will be logged allways\n");
    MyLogFunc(LOG_CLASS_EXTENDED, "This text will be logged only, "
                      "if gCurrentLogLevel >= LOG_CLASS_EXTENDED\n");
}

Listing One: Logging classes and functions

By default, an application should use the basic level. Logging should never be switched off. The other levels should be enabled on demand by tech support. Users can then reproduce a problem and generate detailed logging information. It should also be possible to switch to extended level logging permanently. This level is reasonable for your test department and for power users.

The debug level might be exhaustive and it may slow down a program enormously -- but it might also contain everything needed to identify a problem. Therefore, the tradeoff is sometimes quite worthwhile.

Additional Information

Log entries should start with a full date/timestamp. Logging output related to certain elements should contain the element's ID. In situations where problems arise in some areas of code without any indication of the real cause, it's a good idea to spread more log statements around where it was first seen in order to get close to the problem. Such log statements can contain as little as the source file name, line number, and the current value of a variable.

Logging Program Faults

Most importantly, log files must record information on errors: Positive asserts, access violations, C++ or Java exceptions. Logging asserts implies that you write your own assert macros (or assert functions in Java) and that you keep the asserts in released versions. A custom assert macro will log the error, log a stack unwind, and then terminate or proceed as appropriate.

Access violations can be caught and logged by an OS-provided interrupt handler. C++ exceptions also can be logged at the point where the fault actually happened (at least this can be done with some compilers; see Visual C++ Exception-Handling Instrumentation. In Java, the stack unwind is free.

All of these errors should lead to a common function that logs the related text, logs a stack unwind, and then does whatever is reasonable (terminate, try to continue, and so on). Another question is how to proceed with a log file that now contains such an error. The file must somehow be saved and users should be informed that an error log has been generated that should be transmitted to tech support. For more discussion on these topics, see Postmortem Debugging.

#define MYASSERT(x)\
  if(!(x)){MyErrorFunc(#x, 0, "", __FILE__, __LINE__); return  ;}

#define MYASSERT2(x, i1)\
  if(!(x)){MyErrorFunc(#x, i1, "",__FILE__, __LINE__); return  ;}

#define MYASSERT3(x, i1, s1)\
  if(!(x)){MyErrorFunc(#x, i1, s1, __FILE__, __LINE__); return  ;}

Listing Two: Assert macros.

void    MyErrorFunc(char *Name, int Val, char *Str, char *File, int Line)

{     char tx[512], tx2[512];
      // Record that my Application exited with Error
      SetWeDidExitWithError();
      // Log the Error encountered
    MyLogFunc(LOG_CLASS_BASIC,
                "\n*************************************************\n"
                 "*** Internal Error: \"%s\" (%s, %d)\n"
                 "*** File \"%s\" Line %d\n"
                 "*** Appl Vers.: %-12s             ***\n"
                 "*** Time & Date: %-12s %-12s      ***\n",
                 Name, Val, Str,
                 File, Line,
                 myApplVersion,
                _strdate(tx), _strtime(tx2));

    // Log The Memory State and other boundary conditions
      MyLogFunc(LOG_CLASS_BASIC, MemoryInfo(tx));
    // Inform User about the Error
      MyDialog("Internal Error: \"%s\" (%s, %d); \nFile %s, \nline %d",
                Name, Val, Str, File, Line);
      // Possibly you may try to continue execution here
    exit(-1);
}
///////////////////////////////////////////////////////
void testFunc2()
{
    int x = 0, y = 2;
    MYASSERT(x == 1);
    MYASSERT2(x == 1, y);
    MYASSERT3(x == 1, y, "x must be 1 in testFunc2");
}

Listing Three: Assert function

Formatting Log Output

In general, there's little need to internationalize logging output. It's only made for the programmer and tech support and probably should be in English. It is good practice to hardcode log statement strings in sources even if an application GUI is internationalized. It makes it easier to maintain and find entries that you see in a log file.

Although internationalization and externalization of log strings is not useful, it's as important to use proper formatting, indenting, aligning, and so on, as it is in source code. Unstructured logging information is much less useful and makes it harder to find the needed information. Breaking longer lines is not necessarily an improvement, because successive lines will be invisible when searching the log file.

Logging Binary Information

Sometimes you manage data with content definitions that you don't know. For example, your program might store information about the printer used, but some or all of this data might be a typeless array. However, the array will contain the name of the printer and some of its settings in readable text. If some customers report printing problems, you may want to know which printers they use. You can then print all human-readable text of the array to your log file and skip the rest.

void DebugBinary(void *p, int len)
{  
    char *tx = new char[len];
    memcpy(tx, p, len);
    for(int i = 0; i < len; i++)
        if(! isprint( (unsigned char)tx[i] ))
                     tx[i] = '.';
    MyLogFunc(LOG_CLASS_BASIC, tx);
    delete [] tx;
}

Listing Four: Logging binary information

How To Call Logging Functions

In general, it is a bad idea to use native calls like fprintf for each log statement. Functions that are called frequently should always be encapsulated. The overhead for log files -- if inserted with care -- is negligible. I program in C++ and strongly go for theprintf style (see StringPrintf: A Typesafe printf Family for C++. You can use variable argument lists or write a typesafe stringprintf-like interface to create such a function.

A Log function should have at least two parameters:

The logging level, that belongs to the log statement
The string to log

A better technique might group logging and notification to the user with user decisions all together in one interface. Such a function would not only contain a logging level, but also a bitfield containing several flags:

Show a Dialog with the text string passed?
Show a Abort/Continue Dialog with the text string passed?
Add the Text to a Function specific protocol?
The logging level in terms of bitfield flags?

typedef enum
{
    LOG_BASIC       = 0x0001,
    LOG_EXTENDED    = 0x0002,
    LOG_DEBUG       = 0x0004,
    ALTERNATE_LOG   = 0x0010,
    CONFIRM_DLOG    = 0x0100,
    ASK_CANCEL_DLOG = 0x0200
    // ...
};
int MyMessageFunc(long Attribs, char *string)
{
    if(Attribs & (LOG_BASIC | LOG_EXTENDED | LOG_DEBUG))
      {
        int logClass = (Attribs & LOG_DEBUG   ) ? LOG_CLASS_DEBUG :
                       (Attribs & LOG_EXTENDED) ? LOG_CLASS_EXTENDED :
                        LOG_CLASS_BASIC;
        MyLogFunc(logClass, string);
      }
    if(Attribs & ALTERNATE_LOG)
        LogToAlternateFile(string);
    if(Attribs & CONFIRM_DLOG)
        MyDialog(string);
    if(Attribs & ASK_CANCEL_DLOG)
        return MyAskCancelDlog(string);
    return 0;   
}
/////////////////////////////////////////////////////
void testFunc3()
{
    MyMessageFunc(LOG_BASIC,
           "This text will just be logged");
    MyMessageFunc(LOG_BASIC | CONFIRM_DLOG,
            "This text will be logged and shown in a dialog");
    int result = MyMessageFunc(LOG_BASIC | ASK_CANCEL_DLOG,
            "This text will be logged and displayed in a OK/CANCEL Dialog."
            "The result is the choice made by the user");
}

Listing Five: A Generalized Message and Log Handling

To keep the performance impact low, you should check the logging level in such a function before you process the string any further. You will find that in some cases (mostly for Logging Class 2 calls), it is better to check the logging level explicitly before even calling the logging function or constructing string objects. Building C++ or Java strings is expensive. And the strings will be built, even if the called logging function does not do any processing. In the case of a variable argument list for C-Strings, string processing happens in the logging function itself and is skipped if the logging level skips processing.

Back in the day, it was important to call something like fflush after each statement, or the last (and most important) lines would be missing from the log file when the application crashed. But that no longer seems to be the case with modern operating systems.

User Interface and Parameters

Your application should provide some logging-related options to users via menus or preferences:

The Logging level. It should be editable by users during runtime. Put it somewhere that even the dimmest users will find it. A good place for this seems to be the About dialog.
"Open Log files Folder" or "Send Log File." We are continually amazed by how many users have serious problems navigating through a folder structure even when guided via telephone. They just don't find the file needed. Windows, by default, hides the file name suffix, which makes it even more difficult.
Sometimes its useful to define a alternative location name for the log file; for instance, when several applications should log to a common folder.

Conclusion

Log files may seem uninteresting/boring and thus get ignored by many programmers, but this is a huge mistake. Log files are simple to make and use and can be enormously helpful. In Part 2, I'll show more specialized logging techniques for multithreaded applications or large data sets.