In part one of this three-part series on deep testing complex systems, I covered the practical aspects of deep testing and demonstrated how to test difficult areas of code, such as the user interface, networking, and asynchronous code.
In this article, I discuss some techniques and utilities I have used to successfully test complex systems in C++. In the final article next week, I'll discuss similar complex testing for Python (with Swig C++/Python bindings) and high-volume, highly available C#/.NET Web services.
Obstacles to Testing Complex C++ Code
C++ arguably poses the most difficult testing challenges among the major industrial-strength programming languages and environments. There are several to deal with: build speed is slow; the type system is weak; and C++ programmers love contorting code or reaching for template-based designs to extract every last bit of performance. Let's look at how these factors affect testing complex code.
Slow Build Speeds
C++ projects (especially cross-platform projects) are often built as one big monolith. They can comprise large numbers of static project libraries along with many more third-party libraries that are all linked together to generate the final executable (often, several executables).
The result of this monolithic construction is less-than-stellar dependency management. I've seen several large C++ codebases where every merge was a project in itself: a dedicated build engineer had to convene multiple developers to help resolve conflicts and build the software, sometimes requiring multiple build passes to combat cyclic dependencies. Add the tools used to build many cross-platform C++ projects (such as make) and you get a picture of slow, very complex builds.
The impact of this on testing is profound. When simply building the software is such a chore, the tendency is to minimize the number of artifacts you generate. This leads to one of several unpleasant choices:
- No tests: just build the software and play with it; trust your QA team to find all the bugs.
- Add some test support to the main software with compile-time or runtime switches.
- One big test executable (or test framework) that links against all the dependencies and contains all the tests.
- Many small test executables, each linked against its dependencies.
Option #1 is an obvious no-go from a serious testing point of view (but very common in practice).
Option #2 is cumbersome. It doesn't let you perform isolated testing, and it complicates the final product with testing code and dependencies.
Option #3 requires a slow build of the huge test executable whenever you make a small change (either to the code under test or to the test itself).
Option #4 requires a lot of tinkering. It's pretty good when you work on an isolated bug or feature and have to build just one small test executable. But when you want to run a suite of regression tests, you will wait a long time for the build, because every small executable has to link against all its dependencies, duplicating a lot of effort (especially for common libraries used by every test executable).
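To make Option #2 concrete, here is a minimal sketch of test support embedded in the main software behind both a compile-time and a runtime switch. Every name in it (APP_ENABLE_SELF_TEST, clamp_percent, run_self_test, app_main, the --self-test flag) is hypothetical and chosen only for illustration; it is not taken from any particular codebase.

```cpp
#include <cassert>
#include <cstdio>
#include <cstring>

// Hypothetical switch name. It defaults to on in this sketch; a release
// build would pass -DAPP_ENABLE_SELF_TEST=0 to strip the test code.
#ifndef APP_ENABLE_SELF_TEST
#define APP_ENABLE_SELF_TEST 1
#endif

// Hypothetical production function under test.
int clamp_percent(int value) {
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}

#if APP_ENABLE_SELF_TEST
// Test code that lives inside the product binary. Returns the failure count.
int run_self_test() {
    int failures = 0;
    if (clamp_percent(-5) != 0) ++failures;
    if (clamp_percent(42) != 42) ++failures;
    if (clamp_percent(250) != 100) ++failures;
    return failures;
}
#endif

// Stand-in for the product's entry point; a real main() would forward here.
int app_main(int argc, const char* const* argv) {
    assert(argv != nullptr);
#if APP_ENABLE_SELF_TEST
    // Runtime switch: the shipped binary doubles as its own test runner.
    if (argc > 1 && std::strcmp(argv[1], "--self-test") == 0) {
        const int failures = run_self_test();
        std::printf("self-test: %d failure(s)\n", failures);
        return failures == 0 ? 0 : 1;
    }
#endif
    std::printf("clamped: %d\n", clamp_percent(150));
    return 0;
}
```

Even at this size the drawbacks are visible: the test code ships inside the product unless every release build remembers to turn the switch off, and clamp_percent cannot be exercised without building and linking the whole application.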
The Quest for Performance
C++ was designed for raw performance. You pay only for what you use. It is compatible with C, which started as a glorified assembly language. In later years, C++ became a multi-paradigm language supporting procedural, object-oriented, and generic programming styles, while still maintaining its C compatibility and performance. However, this led to a programming language that is an order of magnitude more difficult to learn and practice than other mainstream languages.
In its glory days (early '90s), C++ was used for everything, especially on Windows. You wrote your number-crunching code, your user interface, and your networking code in C++, and that was that. The rise of the Web and Java changed everything. Suddenly, the programming language of your project wasn't a given, and you could even mix and match programming languages in the same project. Later, hardware got faster, other languages got better (C#, Python), and Web applications were developed in every possible language except C++ (anyone remember ATL Server?). That relegated C++ to the engine, where it does what it does best: fast processing.
C++ programmers (unless they are polyglots who dabble in other languages) justifiably focus more than anything on generating fast and tight code. Many great C++ developers formed their habits before automated unit testing and the agile movement made it to the big stage. Also, because performance is very often a system-wide property and not just the sum of the performance of the various components, its pursuit tends to diminish the perceived usefulness of unit testing.
Very Weak Type System
C++ is a statically typed language, but its type system is very weak. You can freely cast any pointer to any other pointer using reinterpret_cast. The compiler can perform implicit casts, coercions, and promotions on your behalf in certain situations. In addition, C++ has the legacy C preprocessor, which allows inventing private sub-languages in a non-typesafe way. Add to that powerful templates with no constraints. Because all these features are heavily used in industrial-strength programs, you end up with a combustible concoction that is difficult to reason about, and it is hard to feel secure that your tests cover all the ways the code can misbehave.
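A short sketch of three of these holes may help: a reinterpret_cast between unrelated types, implicit conversions at a call site, and a preprocessor macro. All type and function names here are invented for illustration, and the constexpr alternative assumes a C++11 or later compiler. The assertions describe behavior that typically compiles without complaint at default warning levels.

```cpp
#include <cassert>

// 1. Unrelated types, yet the cast compiles. Dereferencing the result would
//    be undefined behavior, and the compiler is not required to diagnose it.
struct Celsius { double degrees; };
struct Account { double balance; };

Account* as_account(Celsius* c) {
    return reinterpret_cast<Account*>(c);
}

// 2. Implicit conversions: callers can pass a double or a negative int,
//    and the compiler converts silently at the call site.
int percent_of(int total, int percent) { return total * percent / 100; }
unsigned int buffer_size(unsigned int items) { return items * 4u; }

// 3. Preprocessor: pure text substitution with no types and no precedence.
#define SQUARE(x) x * x
constexpr int square(int x) { return x * x; }  // type-checked alternative

void demonstrate_pitfalls() {
    Celsius temperature{21.5};
    Account* bogus = as_account(&temperature);
    // The only safe use of the bogus pointer is casting it back.
    assert(reinterpret_cast<Celsius*>(bogus) == &temperature);

    double rate = 12.9;
    assert(percent_of(200, rate) == 24);  // 12.9 is truncated to 12

    int negative = -1;
    assert(buffer_size(negative) > 1000u);  // -1 wrapped to a huge unsigned value

    assert(SQUARE(2 + 1) == 5);  // expands to 2 + 1 * 2 + 1
    assert(square(2 + 1) == 9);  // the function evaluates its argument first
}
```

Each of these call sites looks reasonable in isolation, which is why a test suite has to probe types and values the author never intended to accept, not only the intended inputs.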