Testing Python and C# Code

Because RabbitMQ was a new third-party piece of software that would be a critical component of our system, I wanted to test its integration thoroughly. That involved running multiple tests against a local cluster of three nodes (all running on my local machine), as well as running the same tests against a remote RabbitMQ cluster. The tests involved tearing down, recreating, and configuring the cluster in different ways, and then stress-testing it. Setting up and configuring a remote RabbitMQ cluster involves multiple steps, each normally taking less than a second. On occasion, though, a single step can take up to 30 seconds. Here is a typical list of the necessary steps for configuring a remote RabbitMQ cluster:

  • Shut down every node in the cluster
  • Reset the persistent metadata of every node
  • Launch every node in isolated mode
  • Cluster the nodes together
  • Start the application on each node
  • Configure virtual hosts, exchanges, queues, and bindings

I created a Python program called Elmer that uses Fabric to remotely interact with the cluster. Due to the way RabbitMQ manages metadata across the cluster, you have to wait for each step to complete on every node in the cluster before you can execute the next step; and checking the result of each step requires parsing the console output of shell commands (yuck!). Couple that with node-specific issues and network hiccups and you get a process whose duration varies widely. In my tests, in addition to graceful shutdown and restart of the whole cluster, I often want to violently kill or restart a node.
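
The ordering constraint described above (finish a step on every node before starting the next one) can be sketched in Python as follows. This is a minimal illustration, not Elmer's actual code: the step names, `configure_cluster`, and the `run_step` callback are hypothetical, and the function that performs a step on a node is injected so that the sketch runs without Fabric or a real cluster. The final, cluster-wide configuration step (virtual hosts, exchanges, queues, and bindings) is left out because it is not repeated per node.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical step labels mirroring the list above; they are names for
# this sketch only, not real rabbitmqctl or Fabric commands.
STEPS: List[str] = [
    "shut_down",
    "reset_metadata",
    "launch_isolated",
    "cluster_nodes",
    "start_app",
]


def configure_cluster(
    nodes: Sequence[str],
    run_step: Callable[[str, str], bool],
    steps: Sequence[str] = STEPS,
) -> List[Tuple[str, str]]:
    """Run each step on every node before moving on to the next step.

    run_step(node, step) is an assumed callback that performs the step on
    one node and returns True once that node has completed it.
    Returns the (step, node) pairs in the order they were executed.
    """
    log: List[Tuple[str, str]] = []
    for step in steps:
        for node in nodes:
            if not run_step(node, step):
                raise RuntimeError(f"{step} failed on {node}")
            log.append((step, node))
    return log


if __name__ == "__main__":
    # A fake runner that always succeeds, standing in for remote calls.
    order = configure_cluster(["node1", "node2", "node3"], lambda n, s: True)
    for step, node in order:
        print(step, node)
```

Because the outer loop is over steps and the inner loop is over nodes, a slow node delays the whole step, which is exactly where the time variation comes from.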

From an operations point of view, this is not a problem. Launching a cluster or replacing a node is a rare event, and it's OK if it takes a few seconds. It is quite a different story for a developer who wants to run a few dozen cluster tests after each change. Another complication is that some use cases require testing unresponsive nodes, which raises a halting-problem-style question (is the node truly unresponsive or just slow?). After suffering through multiple test runs where each test was blocked for a long time waiting for the remote cluster, I ended up with the following approach:

  1. Elmer (the Python/Fabric cluster remote control program) exposes every step of the process
  2. A C# class called Runner can launch Python scripts and Fabric commands and capture the output
  3. A C# class called RabbitMQ utilizes the Runner class to control the cluster
  4. A C# class called Wait can dynamically wait for an arbitrary operation to complete
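
As a rough Python analogue of what a class like Runner has to do (launch an external command, capture its output, and give up after a timeout), consider the sketch below. It is not the author's C# code: `run_and_capture` and its parameters are hypothetical names, and it relies only on the standard `subprocess` module.

```python
import subprocess
import sys
from typing import Optional, Sequence, Tuple


def run_and_capture(
    command: Sequence[str], timeout_seconds: Optional[float] = None
) -> Tuple[int, str]:
    """Run a command and return (exit_code, combined stdout and stderr).

    Raises subprocess.TimeoutExpired if the command does not finish
    within timeout_seconds.
    """
    completed = subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into the captured text
        text=True,
        timeout=timeout_seconds,
    )
    return completed.returncode, completed.stdout


if __name__ == "__main__":
    # The current interpreter stands in for a Python script or Fabric command.
    code, output = run_and_capture([sys.executable, "-c", "print('status ok')"])
    print(code, output.strip())
```

Returning the captured text rather than printing it is what lets a caller parse the console output to decide whether a step has completed.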

The key was the Wait class. It has a static method called Wait.For() that lets you wait for an arbitrary operation to complete, up to a given timeout. If the operation completes quickly, you do not have to wait for the timeout to expire; Wait.For() bails out right away. If the operation doesn't complete in time, Wait.For() returns once the timeout expires. Wait.For() accepts a duration (either a TimeSpan or a number of milliseconds) and a function that returns bool. The class also has a Nap member variable that defaults to 50 milliseconds. When you call Wait.For(), it calls your function in a loop until the function returns true or the duration expires, napping between calls. If the function returns true, then Wait.For() returns true; if the duration expires first, it returns false.

using System;
using System.Threading;

public class Wait
{
    public static TimeSpan Nap = TimeSpan.FromMilliseconds(50);

    public static bool For(TimeSpan duration, Func<bool> func)
    {
        var end = DateTime.Now + duration;
        if (end <= DateTime.Now)
            return false;
        while (DateTime.Now < end)
        {
            if (func.Invoke())
                return true;
            Thread.Sleep(Nap); // nap between calls
        }

        return false;
    }

    public static bool For(int duration, Func<bool> func)
    {
        return For(TimeSpan.FromMilliseconds(duration), func);
    }
}
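
For readers following along on the Python side, the same polling idea can be sketched as below. This is an illustrative port, not part of Elmer: `wait_for` and `nap` are names chosen here, durations are in seconds, and the sketch uses `time.monotonic()` instead of wall-clock time so that system clock adjustments cannot move the deadline.

```python
import time
from typing import Callable


def wait_for(duration: float, func: Callable[[], bool], nap: float = 0.05) -> bool:
    """Call func repeatedly until it returns True or duration seconds pass.

    Returns True as soon as func succeeds, or False once the deadline has
    passed. Sleeps nap seconds between attempts.
    """
    end = time.monotonic() + duration
    while time.monotonic() < end:
        if func():
            return True
        time.sleep(nap)
    return False


if __name__ == "__main__":
    # The operation succeeds on the third attempt, so wait_for returns
    # long before the 5-second timeout.
    attempts = iter([False, False, True])
    print(wait_for(5.0, lambda: next(attempts)))
```

The timeout only bounds the worst case; in the common case the cost is a few naps, which is why a generous timeout does not slow the tests down.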

Now, you can efficiently wait for processes that may take highly variable times to complete. Here is how I use Wait.For() to check whether a RabbitMQ node is stopped:

private bool IsRabbitStopped()
{
    var ok = Wait.For(TimeSpan.FromSeconds(10), () =>
    {
        var s = rmq("status", displayOutput: false);
        return !s.Contains("{mnesia,") && !s.Contains("{rabbit,");
    });

    return ok;
}

I call Wait.For() with a timeout of 10 seconds. I wouldn't want to block for that long every time I check whether a node is down (the check happens all the time), and with Wait.For() I don't have to, because it returns as soon as the check succeeds. The anonymous function I pass in calls the rmq() method with the status command. The rmq() method runs the status command on the remote cluster, then returns the command-line output as text. Here is the output when RabbitMQ is running:

Status of node [email protected] ...
     [{rabbitmq_management,"RabbitMQ Management Console","2.8.2"},
      {xmerl,"XML parser","1.3"},
      {rabbitmq_management_agent,"RabbitMQ Management Agent","2.8.2"},
      {amqp_client,"RabbitMQ AMQP Client","2.8.2"},
      {os_mon,"CPO  CXC 138 46","2.2.8"},
      {sasl,"SASL  CXC 138 11","2.2"},
      {rabbitmq_mochiweb,"RabbitMQ Mochiweb Embedding","2.8.2"},
      {mochiweb,"MochiMedia Web Server","1.3-rmq2.8.2-git"},
      {inets,"INETS  CXC 138 49","5.8"},
      {mnesia,"MNESIA  CXC 138 12","4.6"},
      {stdlib,"ERTS  CXC 138 10","1.18"},
      {kernel,"ERTS  CXC 138 10","2.15"}]},
 {erlang_version,"Erlang R15B (erts-5.9) [smp:8:8] [async-threads:30]\n"},

The function makes sure that the mnesia and rabbit components don't show up in the output. Note that if the node is still up, the function returns false and Wait.For() keeps calling it until it returns true or the timeout expires. Wait.For() decreases the sensitivity of my tests to occasional spikes in response time (I can Wait.For() longer without slowing down the test in the common case), and has reduced the runtime of the whole test suite from minutes to seconds.
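
The check itself is a plain substring test, which the following Python sketch isolates so that it can be exercised without a cluster. The function name `is_rabbit_stopped` and the sample strings are assumptions made for illustration; the two markers are the ones used in IsRabbitStopped() above.

```python
def is_rabbit_stopped(status_output: str) -> bool:
    """Return True when neither mnesia nor rabbit is listed in the output."""
    return "{mnesia," not in status_output and "{rabbit," not in status_output


if __name__ == "__main__":
    # Abbreviated, hand-written samples; not captured from a real node.
    running = '[{rabbit,"RabbitMQ","2.8.2"},\n {mnesia,"MNESIA  CXC 138 12","4.6"}]'
    stopped = '[{stdlib,"ERTS  CXC 138 10","1.18"},\n {kernel,"ERTS  CXC 138 10","2.15"}]'
    print(is_rabbit_stopped(running))
    print(is_rabbit_stopped(stopped))
```

The trailing comma in each marker matters: it keeps entries such as `{rabbitmq_management,` from being mistaken for the rabbit application. One caveat of a substring test is that empty output (for example, from a failed command) also counts as stopped.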


This series of articles has shown a variety of design principles and testing techniques for dealing with hard-to-test systems. Nontrivial code will always contain bugs, but deep testing is guaranteed to reduce the number of undiscovered issues.

Gigi Sayfan specializes in cross-platform object-oriented programming in C/C++/C#/Python/Java with emphasis on large-scale distributed systems, and is a long-time contributor to Dr. Dobb's.

Related Articles

Testing Complex Systems

Testing Complex C++ Systems
