Channels ▼

Matthew Wilson

Dr. Dobb's Bloggers

Console width gotcha

September 09, 2010

Just when I thought it was safe for me to be let out to program in public, I go and embarrass myself yet again; it's not like the last time doesn't still zing! It's the usual story: write library function that's not readily amenable to automated testing; perform scratch tests; get inappropriate level of confidence; program confidently; build application many levels of abstraction higher; observe weird behaviour; spend whole day tracking it down; slap self on head for stupidity.

As always, when the solution reveals itself, it's blindingly obvious. Let's work backwards, since that's how I had to approach the problem.

At the moment I'm developing/enhancing a bunch of software analysis tools, to assist me in my work as an expert witness in an IT fraud case. Most of these command-line (CLI) programs. What they actually do is not germane to this post, but suffice to say they perform detailed comparative analyses on a grand scale. I've been (righteously, albeit immodestly) slapping myself on the back for being able to cross-product compare thousands of files from one codebase with thousands of files from another - millions of exhaustive file comparisons in minutes, rather than days, which is what they were taking when I started the optimisation exercise.

So, hubris-mode enabled, a couple of days ago I set about addressing one of the minor defects I was encountering in all the tools. They all make use of an as-yet-unreleased software library - CLASP (Command-Line Argument Sorting and Parsing) - for processing command-line arguments, and emitting usage information. All tools respond to the Linux-standard '--help' option, and duly output their usage information.

The problem was that sometimes I wanted to capture the usage output, either by redirecting to a file, or piping to more or another such tool, and in either case no output was evident. (You might, gentle readers, suggest that this need reveals a deeper problem: that I can't remember the usage of my own tools means either (i) that they're too complicated, and/or (ii) that I've a head like a sieve. We'll have to leave that for now, and focus on the practical problem.)

I tried numerous quick, haphazard things, without much illumination. At no time could I get the program to fail in the debugger, which is usually a portent of a difficult diagnosis ahead. In and amongst my random scribblings I did manage to elicit a diagnostic log statement with the emergency severity level, which was obviously of interest. I did this by enabling a feature that, for good reason, is disabled by default in the tools, namely using C++'s catch-all clause.

  catch(...)
  {
    pantheios::logputs(<strong>pantheios::emergency</strong>, "unexpected unknown error");
  }

The ability to catch operating system exceptions such as access violations, divide by zero, invalid handle and so on via C++'s catch-all clause is compiler-dependent. Unsurprisingly, Microsoft's Visual C++ compiler can do so, but others cannot. (IIRC, Metrowerks CodeWarrior for Windows is one such.) So, for reasons of portability it's not a good idea. More importantly, since (good) C++ software does/should not throw things not derived from std::exception, it's best to assume that any such condition is a fault, resulting from a programming error. Given the fact that one cannot assume anything once a fault is encountered, it's six of one and half a dozen of the other as to whether it's useful to catch and log, or simply to allow the process to crash. Either way, you will, hopefully, receive notification of the fault.

Reproducing the problem

Notwithstanding, with this hint of some problem at a more fundamental level, I realised I had to get all biblical on this defect. The first thing to do was create a program, help_test, that reproduced the behaviour:


 static clasp::alias_t const aliases[] =
 {
   { CLASP_ARGTYPE_FLAG, "-h", "--help", "displays help information and then quits" },
   { CLASP_ARGTYPE_FLAG, "-v", "--version", "displays the program version and then quits" },

   CLASP_ALIAS_ARRAY_TERMINATOR
 };

 void show_usage_header(FILE* stm);
 void show_usage_body(FILE* stm);

 void show_usage(FILE* stm)
 {
   show_usage_header(stm);
   show_usage_body(stm);
 }

 int tool_main(clasp::arguments_t const* args)
 {
  if(clasp::flag_specified(args, "--help"))
  {
   <strong>show_usage</strong>(stdout);
  }

  return EXIT_SUCCESS;
 }

[tool_main() is the entry point, called with the CLASP-parsed command-line representation. The details are irrelevant here.]

Remember, the problem was it worked fine when used normally, as in:

 > help_test.exe --help

Produces output including:

  . . .

  Options:

    -h
    --help
      displays help information and then quits

    -v
    --version
      displays the program version and then quits

But when redirected or piped ...

 > help_test.exe --help | tee help-output.txt

... it failed to produce any output at all!

The first thing to do was to insert some more diagnostic logging statements:


 <strong>#include <pantheios/pan.hpp></strong>

 void show_usage_header(FILE* stm)
 {
  <strong>pan::<a href="http://www.pantheios.org/doc/html/group__group____application__layer__interface____generated.html#g54a281c4cf192aea0868e8aaf5a0dd92">log_DEBUG</a>("show_usage_header()");</strong>

  . . .

  <strong>pan::log_DEBUG("~show_usage_header()");</strong>
 }
 void show_usage_body(FILE* stm)
 {
  <strong>pan::log_DEBUG("show_usage_body()");</strong>

  . . .

  <strong>pan::log_DEBUG("~show_usage_body()");</strong>
 }


 void show_usage(FILE* stm)
 {
  <strong>pan::log_DEBUG("show_usage()");</strong>

  show_usage_header(stm);
  show_usage_body(stm);

  <strong>pan::log_DEBUG("~show_usage()");</strong>
 }

 int tool_main(clasp::arguments_t const* args)
 {
  if(clasp::flag_specified(args, "--help"))
  {
   show_usage(stdout);
  }

  return EXIT_SUCCESS;
 }

This quickly showed that the problem was within show_usage_body(), which never returned when the program was piped/redirected:

 void show_usage_body(FILE* stm)
 {
  pan::log_DEBUG("show_usage_body()");

  <strong>clasp_show_body(
    NULL
  , aliases
  , clasp_show_body_by_FILE
  , stm
  , 0
 #ifdef WIN32
  , <a href="http://www.xstl.org/doc-1.9/console__functions_8h.html">winstl::get_console_width()</a>
 #else
  , 76
 #endif
  , -4
  , 1
  );</strong>

  pan::log_DEBUG("~show_usage_body()");
 }

(If you have already spotted the defect, well done. To me now it's painfully obvious, but at the time ...)

Debugging a redirected program

At this point, I figured I was definitely dealing with a defect, and that the quickest way to find it would be to debug. But how to debug a process whose output is being redirected? It's not like you can put "--help | more" into your IDE command-line; it'll just pass all three tokens along as arguments. It's not a shell now, is it?

In cases such as this, what's needed is to attach the debugger to the process after the shell's done it's redirection magic, but before the offending function is called. The way to do this is not difficult, though it is somewhat tedious.

First, we need to get the process to pause. You could insert a Sleep(10000);, but that's kind of a drag because while you're familiarising yourself to moving between the various windows you're likely to take longer than 10s, and miss it, and once you are familiar you'll find waiting nearly 8-9s for the process to continue a real drag. (Well you will if you're as impatient as me, anyway.) A better approach is to cause the program to stop as it would (in debug mode, anyway) when it hits an assertion, since this puts up a modal dialog, thereby providing a user-controlled wait.

 int tool_main(clasp::arguments_t const* args)
 {
  <strong>_ASSERTE(false);</strong>

  if(clasp::flag_specified(args, "--help"))
  {
   show_usage(stdout);
  }

  return EXIT_SUCCESS;
 }

(This uses Microsoft's C-Runtime Library's _ASSERTE() macro. Substitute as necessary for your preferred compiler/RTL.)

Next step is to attach to the debugger. This is pretty simple as well. Open Task Manager (Ctrl-Shift-Esc for fellow non-mouse-ditherers) and, from the Processes tab, select help_test.exe. Right-click, and select Debug. When it warns you and requests confirmation, click Yes.

The final step is to select the debugger. Unless you like inconvenience, carry out the first two steps with the IDE you used to create the program, with the program's project open. Then, from the list of possible debuggers, select the desired IDE instance, and bang!, you're in. All that's left is to set a breakpoint on the line after the faux assertion, and click the ignore option from the assertion dialog (which your debugged program is still showing), and Bob's your uncle.

Ok, so that's how to debug a running process in three easy steps. Let's get back to the specifics of my problem, and find the source of my shame.

Each time I stepped over show_usage_body(), and the CLASP functions it calls, clasp_show_body() and clasp_show_body_by_FILE(), the program ended silently. Although CLASP already has the ability to work with an external diagnostic logging library for the command-line parsing aspects, unfortunately that's not yet in the usage output part. Doh! Rather than go an insert stuff into the library diagnostic logging statements into the CLASP internals, I stepped deeper into the bowels of CLASP's usage, and started commenting out code. Pretty soon I found the location of the problem, if not the cause.

The problem was somewhere inside the worker function clasp_show_split_option_help_limit_width_by_FILE_() (shown with comments and chunks of code removed to try and achieve some brevity):

 static void clasp_show_split_option_help_limit_width_by_FILE_(
   clasp_diagnostic_context_t const* ctxt
 , clasp_char_t const* help
 , FILE* stm
 , int width
 , int TabSize
 )
 {
  const size_t len = clasp_strlen_(help);

  if( 0 == width ||
   (int)len <= width - abs(TabSize))
  {
    . . . /* assume infinite width, and print in a single line */
  }
  else
  {
   clasp_char_t* buffer = clasp_strdup_(ctxt, help);

   if(NULL == buffer)
   {
    clasp_show_split_option_help_limit_width_by_FILE_(ctxt, help, stm, 0, TabSize);
   }
   else
   {
    const size_t remainingWidth = (width - abs(TabSize)) - 1;
    clasp_char_t* p0;
    clasp_char_t* p1;

    p0 = p1 = buffer;

    for(; (int)len != p1 - buffer;)
    {
     if((p1 - buffer) + remainingWidth > len)
     {
      <strong>clasp_show_split_option_help_limit_width_by_FILE_(ctxt, p0, stm, width, TabSize);</strong>

      break;
     }
     else
     {
       . . . /* split line to fit width, calling self recursively */
     }
    }

    free(buffer);
   }
  }
 }

Commenting out the emboldened line made the process end normally, albeit some parts of the usage information were not output.

Eureka! moment

But I still couldn't reliably catch the issue in the debugger. So I decided to adopt a 20th-century diagnostic tool, and inserted an fprintf(stderr, ...) statement to record the parameters of clasp_show_split_option_help_limit_width_by_FILE_(). When run normally, the width was indicated as 120, the width of my shell window, in the first call (and smaller values in the limited subsequent recursive calls). When run with redirection/piping, however, the width is -1, on the first call, and the next recursion, and the next recursion, and the ... you get the point: the function can't handle negative widths, and recurses forever. Ergo, the program exit is as a result of stack exhaustion.

The cause of this is can be traced back to the use of winstl::get_console_width(), whose implementation returns ~size_t(0) if the width cannot be elicited. Almost too obvious now, isn't it?

Let's list the mistakes here:

  • Using a function, winstl::get_console_width(), that conflates a value return with a sentinel failure code, and then not checking it!
  • Passing the unsigned result to a function expecting a signed value. The result reinterpretation of the bit pattern causes ~size_t(0) to be interpreted as int(-1)! Had both been unsigned, the width would have been effectively infinite, which is what it really is in a redirected stream, and it would have worked as desired.
  • Not incorporating redirection into the automated tests. Although difficult, this is not impossible: get the test program to relaunch itself with redirected standard output stream, capturing the output of the child, and verifying its content and form.
  • Not having subjected the work to peer review.

Given the confidentiality restrictions of the work I'm doing, the last one is excusable since I'm not permitted to do otherwise. But there's no excuse for the first three.

As is often the case, it's the confluence of several mistakes and unusual conditions. I guess the lesson is, no matter how diligent one's general practice, or extensive one's experience, mistakes can still be made, and will make their presence felt in subtle ways that are difficult to reproduce and to diagnose. It would be churlish (and cowardly) of me to suggest that this instance counts as evidence of the broader challenge of dealing with complex software systems - in this case it feels more like slack practice on my part - but it certainly illustrates that there be dragons, no matter what the intentions.

Just when I thought it was safe for me to be let out to program in public, I go and embarrass myself yet again. It's the usual story: write library function that's not readily amenable to automated testing; perform scratch tests; get inappropriate level of confidence; program confidently; build application; observe weird behaviour; spend day tracking it down; slap self on head for stupidity ...

Related Reading


More Insights






Currently we allow the following HTML tags in comments:

Single tags

These tags can be used alone and don't need an ending tag.

<br> Defines a single line break

<hr> Defines a horizontal line

Matching tags

These require an ending tag - e.g. <i>italic text</i>

<a> Defines an anchor

<b> Defines bold text

<big> Defines big text

<blockquote> Defines a long quotation

<caption> Defines a table caption

<cite> Defines a citation

<code> Defines computer code text

<em> Defines emphasized text

<fieldset> Defines a border around elements in a form

<h1> This is heading 1

<h2> This is heading 2

<h3> This is heading 3

<h4> This is heading 4

<h5> This is heading 5

<h6> This is heading 6

<i> Defines italic text

<p> Defines a paragraph

<pre> Defines preformatted text

<q> Defines a short quotation

<samp> Defines sample computer code text

<small> Defines small text

<span> Defines a section in a document

<s> Defines strikethrough text

<strike> Defines strikethrough text

<strong> Defines strong text

<sub> Defines subscripted text

<sup> Defines superscripted text

<u> Defines underlined text

Dr. Dobb's encourages readers to engage in spirited, healthy debate, including taking us to task. However, Dr. Dobb's moderates all comments posted to our site, and reserves the right to modify or remove any content that it determines to be derogatory, offensive, inflammatory, vulgar, irrelevant/off-topic, racist or obvious marketing or spam. Dr. Dobb's further reserves the right to disable the profile of any commenter participating in said activities.

 
Disqus Tips To upload an avatar photo, first complete your Disqus profile. | View the list of supported HTML tags you can use to style comments. | Please read our commenting policy.
 


Video