Channels ▼
RSS

C/C++

About Log Files: Part 2


Stefan is a software developer in Berlin specializing in cross-platform development, user interfaces, and reengineering. He can be contacted at StefanWoe@gmail.com


In Part 1 of this two-part article about log files, I presented log file basics. You've seen how to write log files and how to use them. But how can log files be used in more complex situations? In this installment, I examine how you can use logging with large data sets and how to find rarely occurring bugs. You'll also see how to use log files in threaded or multi-tiered applications.

Rotating Log Files

For server applications, it is important that you don't fill the entire disk. However, it is equally important to keep information for a long time because the original cause of problems might be recorded. In such cases, rotating log files are a proven technique: You create a new log file after a certain amount of timevor when a certain size is reached, then only keep the last couple of log files. The filenames should not just increase an indexed suffix number in a round-robin fashion, but rather include the date and time of creation in a sortable format. This makes it easier to identify the "right" log file. Whenever a new file is created, the oldest files are deleted by a defined policy; for example, keep the last 10 log files, or those of the last three or so days.

Ring Logging

I once had a problem with an assertion that very rarely failed very deep in the database layer of my application. A class function was called in a class state where it wasn't meant to be called. This only happened at my client's machines, and I wasn't able to reproduce the bug -- not even once. This is where ring logging comes in.

Ring logging is implemented via the following steps:

  1. Create an internal array with a fixed number of string buffers (preferable c-Strings for good performance).
  2. Create a integer variable holding the last string used in the array.
  3. Create a function that adds a String to the array after the last used index. If the index points to the end of the array, than restart with the first element and just overwrite it -- but keep all other elements.
  4. Create a function that dumps all entries to your log file, starting from the oldest entry (which usually is the next array element to overwrite) up to the latest entry.

 /////////////////////////////////////////////////////////////////////

// Single Ring logging Record
class CRingLogEntry
{
public:
    void Set(char *File, int Line, char *String)
    {
        memset(this, 0, sizeof(CRingLogEntry));
	    strncpy(mFileName, File , sizeof(mFileName));
	    mLine = Line;
	    strncpy(mText    , String, sizeof(mText));
    };
    void Dump()
    {
        MyLogFunc(LOG_BASIC, "%s %d: %s", mFileName, mLine, mText);
    }

    char mFileName[255];
    int  mLine;
    char mText[255];
};
/////////////////////////////////////////////////////////////////////////
// RingBuffer and Counter
#define LOGENTRYCOUNT 255
CRingLogEntry gRingBuffer[LOGENTRYCOUNT];
static int    gCurrentEntryNo = 0;
/////////////////////////////////////////////////////////////////////////

// Add a Entry to RingBuffer
void AddRingLogEntry(char *File, int Line, char *text)

{
    gRingBuffer[gCurrentEntryNo].Set(File, Line, text);
    gCurrentEntryNo ++;
    if(gCurrentEntryNo >= LOGENTRYCOUNT)
       gCurrentEntryNo = 0;
}

// Dump All entrys
void DumpRingLogBuffer()

{
    for(int i  = gCurrentEntryNo, turned = 0; 
            i != gCurrentEntryNo || ! turned; i++)
    {
        if(i >= LOGENTRYCOUNT)
        {
           i = -1; 
           turned = 1;
           continue;
        }

        gRingBuffer[gCurrentEntryNo].Dump();
    }
}
//////////////////////////////////////////////////////////////////////////
#define ADDRINGENTRY(x)\
  AddRingLogEntry(__FILE__, __LINE__, #x);

void testFunc4()
{
    for(int i = 0; i < 10000; i++)
    {
        char tx[255];
        sprintf(tx, "i = %d", i);
        ADDRINGENTRY(tx);
    }
    // This will show the last entries from 9745 to 10000
    DumpRingLogBuffer(); 
}

Listing One: Logging classes and function

Listing One implements a ring buffer, along with macros that simplify the creation of log statements and that automatically added file and line numbers. I added these macros to most of the class members and inserted the call to the log dump function right before the failing assert -- calling the dump only if the asserted condition was True. I shipped the new version to customers who had reported the error and waited. After a while I received a log file that more closely showed what was happening -- but still I wasn't able to reproduce or fix the problem. However, I did have a closer view of where the problem was hidden. With this knowledge I added more ring log statements and shipped another version. When I received the next log file I saw within a few minutes what the problem was -- and I was able to reproduce it on the spot and fix it right away. And I had spenet only a hour (or less) for the log entries. This is why I say that ring logging is something like the secret service of programming.

Working with Large Datasets

A single log file is good, but what do you do with 100MB of them? You may just browse them, usually starting from the end of the last log file. However, the information you are looking for might be spread over several log files. When analyzing a large number of large log files, use grep or a similar search over several files. Searching for all occurrences of a specific ID will probably reduce the amount of data to a small number of lines.

For example, one application I maintain simulates hydraulic networks. Sometimes a problem occurs only in networks with hundreds of thousands of elements. In such situations I use a function that writes the state of all elements to a log file at certain states (i.e., when entering and leaving the simulation code). I only do this in my debugging version by uncommenting some calls. Of course, it then takes time to log the information for all elements processed. But afterwards I just search for all log entries referring to the element where a problem exists. I might add additional statements and rerun the program. But all that is much more feasible than using a debugger. Debuggers are great, but when working with huge amounts of data it sometimes advantagous to be able to search forward and backwards in a log file, to search for the state of related elements, to double check things again, and so on. This is easily done with a text file. You can't browse or search the execution history of your program in a debugger. You'll have to restart the whole program, which can take a while.

Using less, grep, and tail

Of course, you need good tools to analyze large log files. UNIX programmers are familiar with tools such as grep, less, or tail. They're also available for Windows(see http://unxutils.sourceforge.net/) and arelightweight, fast, and have no size limit. Still, you need a little practice to work with them:

  • Use less (not "more") to view a file and search in it.
  • Use to search through many files and get a list of all matches (type grep --help to see all options).
  • Use tail "f to view the last lines of a log file. Using the -f switch will keep tail alive and successively display all added lines as soon as they are written to the log file. This gives a program a "console" through a log file. Sometimes a relief.

While you can use Notepad or other editors, large files may be difficult. And also most editors refuse to open a log file that's still is in use. "Less" and the other "unix" commands do not complain.


Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 

Video