Sep01: Shift-Left Testing

Larry Smith can be contacted at larry@wildopensource.com.


Shift-left testing is how I refer to a better way of integrating the quality assurance (QA) and development parts of a software project. By linking these two functions at lower levels of management, you can expand your testing program while reducing manpower and equipment needs — sometimes by as much as an order of magnitude.

The typical development/QA cycle is often organized around so-called "base levels." Ideally, developers work for a time, then a base-level build is done, which is then passed to QA, which tests it and feeds the results back for the next round. Some number of these rounds are planned, and the project begins.

QA's role is typically limited to regression testing — that is, detecting when new base levels suddenly break code that has worked before. This is an artificial restriction, and one that can cripple an otherwise good QA group.

Furthermore, a lot of QA testing is still done manually. This is an artifact of the truism that code is often hard to test, and some degree of judgment is needed to decide if a test worked or not.

What's Wrong with This?

Management loves base levels because they are interpreted as important milestones of progress. But often the program in question can't build properly for one reason or another. Base levels can be delayed, sometimes significantly, leaving QA out of the loop when problems occur, which is precisely the wrong time to lose them. Development jams trying to get the base level out, while QA twiddles its thumbs waiting. This is not good.

This mismatch between development and QA schedules arises often and encourages a management structure that gives QA additional work. QA frequently serves more than one master. In my experience, it is most often the primary contact point for customer-reported bugs. In theory, verifying and dispatching these problems keeps QA busy while development tries to get the next base level ready.

It is well known that the earlier a bug is detected, the easier and cheaper it is to fix. Ideally, QA would report bugs detected at each base level directly to the affected groups. In practice, however, this is seldom the case: QA's dual purpose usually forces it into a uniform reporting mechanism that, more often than not, reports bugs not to developers but to their managers, an extra and expensive round of indirection that prioritizes customer bugs over development bugs. The net result is this: QA is an expensive way to find bugs. In fact, QA is the worst of all possible methods to find bugs (except for all the others, of course).

Serving dual purposes can also lead to bottlenecks when development testing demands increase just as customer-problem reports do. Since there is really no reason why these two workloads should be complementary, it's no surprise that they seldom are.

Development programmers are quite simply not good at finding bugs. Nor should they be. Consider the mindset of development programmers: They need to be good coders and good bug fixers — but if they are good at finding bugs, they should be migrating into QA. That is where they can do the most good with such a skill, after all. We should, therefore, not expect development teams to be good at finding bugs. If your management is doing its job, they won't be, almost by definition.

So the bug-finding mindset has a natural home in the QA department. The real challenge is to migrate people with good bug-finding skills into QA, but still bring them to bear much earlier in the cycle. That is where the synergy of development and QA can result in large dividends. It also goes a long way toward reducing the "second class" stigma too often associated with QA.

Whether they know it or not, that stigma is an albatross around management's neck. It arises from the perception (right or wrong) that QA engineers aren't "real" engineers, that they are just warm bodies reading scripts. Sadly that is true too often, and if it is true where you manage, you are doing yourself and your project a disservice.

Hardware resource use is another problem. Ideally, both development and QA get a complete set of all possible hardware permutations you need to support. In practice, of course, this is nearly always financially impossible. Your typical development environment is prone to that most dangerous of budgetary threats — the combinatorial explosion. A budget that has taken a hit from one of these is not a pretty sight. So we know we are going to ship untested code. How can we minimize our exposure?
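To see how quickly the combinatorial explosion gets out of hand, consider a sketch that simply counts full hardware permutations. The axis names and option counts here are invented for illustration, not taken from any real product line:

```python
from itertools import product

# Hypothetical hardware axes; the names and counts are illustrative only.
axes = {
    "cpu_model":  ["ev5", "ev56", "ev6", "ev67"],
    "memory_gb":  [1, 2, 4],
    "disk_ctrl":  ["scsi", "raid"],
    "network":    ["ethernet", "fddi", "atm"],
    "node_count": [1, 2, 3],  # single system up to a three-node cluster
}

# Every combination is a distinct configuration you would ideally test.
configs = list(product(*axes.values()))
print(len(configs))  # 4 * 3 * 2 * 3 * 3 = 216
```

Five modest axes already yield 216 configurations; add a few OS versions and patch levels and no budget on earth covers them all.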

QA is not just about regression testing. While regression testing is useful, it should not be the main focus of QA. Rather, QA should detect bugs as soon as possible and then add tests for each bug to the regression test suite. Bugs tend to cluster in complex code; it does little good, and wastes time and resources, to write and run regression test suites against code with little likelihood of breaking. By inserting QA into the mix early, it can help detect where buggy code is hiding and focus on it early. That not only helps development get the work done, but it focuses your regression test development on the problem areas.
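The find-a-bug, pin-it-forever workflow described above might look like the following sketch. The function under test and the bug number are entirely hypothetical:

```python
# Hypothetical module under test: suppose QA found early that an
# account-name parser stripped trailing digits ("user2" became "user").
# Once confirmed and fixed, the bug gets a named, permanent entry in
# the regression test suite so it can never silently return.

def parse_account(raw):
    # Fixed implementation (illustrative): strip surrounding whitespace
    # only; never strip digits.
    return raw.strip()

def test_bug_1234_trailing_digits_preserved():
    # Regression test for (hypothetical) bug #1234: trailing digits
    # must survive parsing.
    assert parse_account(" user2 ") == "user2"

test_bug_1234_trailing_digits_preserved()
print("regression test passed")
```

Naming the test after the bug report keeps the suite focused on code that has actually broken, rather than on code with little likelihood of breaking.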

And, finally — automate, automate, automate! No test should ever require any human judgment to run to completion. Warm-body testing is the most expensive way to test, and with close integration of QA and development to produce more testable code, it really should never be necessary.

Automated tests run far more quickly than manual ones, and this is reason enough to use them extensively, but they have more benefits than just speed. Automated tests encapsulate knowledge about configuring your program, making them ideal for helping new coders, both development and QA, come up to speed on your program. They do this because it is natural for automated tests to set up their own test environments, turning on features they are going to test, checking to see if they are enabled, and so on. These are the most time-consuming parts of running a test.
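The "set up their own test environments" pattern can be sketched as below: the test enables the feature it needs, verifies the configuration took effect, and only then exercises the feature. The feature-flag file and all the names are assumptions for illustration, not part of any system described in the article:

```python
import json
import os
import tempfile

# Hypothetical feature-flag store: a JSON file mapping feature -> enabled.
def enable_feature(path, name):
    flags = {}
    if os.path.exists(path):
        with open(path) as f:
            flags = json.load(f)
    flags[name] = True
    with open(path, "w") as f:
        json.dump(flags, f)

def feature_enabled(path, name):
    with open(path) as f:
        return json.load(f).get(name, False)

def run_test():
    # The test configures its own environment...
    path = os.path.join(tempfile.mkdtemp(), "features.json")
    enable_feature(path, "extended_auth")
    # ...verifies the configuration actually took effect...
    assert feature_enabled(path, "extended_auth")
    # ...and only then exercises the feature itself (elided here).
    return "PASS"

print(run_test())
```

Because the setup steps are code, they also serve as executable documentation of how to configure the program, which is exactly what a new engineer needs.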

Automated tests do not need high-powered engineers to run them. Once they have been engineered, built, and packed into a regression test suite, the actual testing work can be handled by entry-level engineers (or even nonengineers), freeing up expensive QA and development engineers to test further, automate more, and generally accomplish far more than would otherwise be the case.

Shift It Left

We faced versions of all of these problems in my work at Digital/Compaq on what is now called "Tru64 UNIX." Here's what we did to address these problems:

Involve QA early. Well, okay, this one is motherhood and apple pie, right? Yes, in theory. But in practice it isn't. Too often, testing issues fall by the wayside. QA is too often seen as overhead and not part of the development process. I've worked in shops where the relationship between the two sides was actually adversarial, as if it were part of QA's job to keep the product from shipping. That is simply wrong.

In fact, QA is as much a part of the development process as writing the code in the first place. "Designated Responsible Individuals" for QA of each task should be assigned when the task is first staffed, and though they should report to their own management, they should in all ways operate as part of their development team. They should be well known to all the team members, go to team meetings, and get briefed by management as an integral part of that team.

This is especially critical in dealing with the stigma issue. QA is a specialized task, often a black art, as much so as compiler internals, databases, or other esoteric programming disciplines. People who are good at it should be respected and valued. But when they labor in obscurity, visible only as apparently clueless bug reporters on some other floor, the lack of communication not only limits their effectiveness, but leads to a disdain for QA work. This actively discourages engineers who are good at QA work from getting anywhere near it. That hurts your project, and in the end, it hurts the engineers who cannot shine doing the work they do best.

Get test resources lined up at the start. When the Tru64 UNIX security experts decided they needed a new login system, they knew they were calling into question a lot of old, stable code that would need to be retested. And they knew they had no resources to deal with that testing. Too often this kind of problem is left to fester until it is a crisis, but instead, security decided to get QA involved on day 1. And so I was assigned.

Getting involved this early was unusual at the time in our company, but it afforded an excellent opportunity for me to test out my ideas about integrating QA. The results were worth the relatively minor effort involved.

Get tests into coding soon. What I did was to invite myself to all group meetings and drop by to get to know the people most involved in the work. I was involved while the functional specification was being written, so I wrote the testing specification in parallel and began coding the new authorization tests at the same time that coding began on the new authorization code.

In this manner I obtained high-bandwidth, person-to-person communication with development engineers. I got early and useful earfuls of information about the areas they were worried about, which warned me where I needed to focus my testing. I also became known to them. They soon picked up that I, too, was an engineer of some talent, and I became part of the team, rather than just a check-off box in a QA plan. The developers were incredibly pleased at having a dedicated test engineer in their midst. The stigma that had been visible in other testing efforts evaporated overnight. When development can talk and work directly with their QA support, there is no "us and them." There is only the team and the new way of working. I was there to help pull the load, and my team knew that and appreciated it, far more so than they had ever appreciated the distant "QA Department" they had worked with previously. This attitude opened new doors.

Coding for Testability

Test code is a bear to write because it must deduce the correct operation of things from their effects, and how to do that is not always obvious. In particular, just being logged in, for example, didn't prove the new code was being executed properly. My testing specification had one tremendous advantage being so early in the cycle: I simply noted that I required certain optional messages to a system log. With that one line in my specification, I hugely simplified the test program because the authorization code was now designed to be easy to test. This was essentially free.

At this phase, it was trivial to request — and get — log messages that would've been a pain in the neck to add later on. The specification phase is the best place to discover these needs for that reason. Had we waited until the usual time — much later in the project, with much code already written — this would've had a much bigger impact, and would probably be impractical to do. The test code would be the worse for it — it might even be (you should excuse the word) manual — for without testability features, automation is much harder to do. Testability was built-in, not tacked on. Indeed, it saved so much effort that I finished my test program long before the authorization code itself was ready to be passed to QA.
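The log-message hook requested in the specification might look like the sketch below. The message format and function names are invented, since the article doesn't give them; the point is that success alone doesn't prove the new code path ran, but the optional diagnostic line does:

```python
import io

def authorize(user, log=None):
    # Hypothetical new authorization path. When a log stream is supplied,
    # emit an optional diagnostic message: the testability hook requested
    # in the specification phase.
    granted = user != "nobody"
    if log is not None:
        log.write(f"AUTH: new-path user={user} granted={granted}\n")
    return granted

def test_new_path_executed():
    log = io.StringIO()
    assert authorize("larry", log=log)
    # Being logged in isn't enough; the log line proves the NEW code ran.
    assert "AUTH: new-path user=larry granted=True" in log.getvalue()

test_new_path_executed()
print("ok")
```

One line in a specification ("the code shall optionally log its decisions") is cheap when the code hasn't been written; retrofitting it later is not.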

Let Developers Develop, Testers Test, and Admins Configure

Since our current round of customer bugs was light, I took advantage of this slack time to put the test code to work doing advanced testing. That is, I used it to help out the developers doing unit testing of the code, long, long before they reached their first base level. This had two effects: First, it took some of the testing load off the developers, and second, it exercised and tested both the authorization code and the program designed to test it. Errors were reported directly to the affected coder, and in so doing we were able to sort testing bugs out from authorization bugs, proving out the test code as well as the authorization code.

It is important to note that test code is not special in any way. It needs to be specified, tested, and debugged as much as the code it tests. This is often forgotten in QA work. By moving this process forward in the test cycle, we not only helped development along, we also compressed the calendar time needed to prove the test code correct.

Another important consideration: Development engineers write test code all the time that is usually thrown away. But with a QA engineer in the cycle, this code can be salvaged. Some of the tests I wrote I actually didn't code at all — I merely adopted them from developers who wrote them as a matter of course and adapted them to automated use.

The authorization code was critical in every hardware configuration and so needed testing in a large number of configurations. As I watched the administrators tasked with these combinations set up systems in the labs, it occurred to me that those systems made wonderful, preconfigured test beds for my software as well.

There is a lot of QA time wasted configuring systems. Typically, I would get a bare system, or one with just the OS installed, with no clue as to whether the features I needed to test were installed or enabled, and often I would spend more time prepping a system for test than actually testing. Here was an opportunity to avoid a lot of that hassle: I just started using the development test systems.

Prebase-Level Testing and Tight Reporting

With easy access to development people, management, and labs, it was easy to find plenty of downtime for a system already prepped to exactly the configuration I needed. I could run my tests in this downtime and report the results directly to the coders. And what's more, before each patch was submitted to the next base level it was tested with all other security-related patches against the previous build. It was, in effect, a prebase-level base level — with everything I needed in place to know if the next base level would have problems. I knew exactly what state each patch was in when it went into the next base level. Once the official base level came around, I could safely eliminate many tests based on my knowledge of how they ran in the prebase-level tests.

Speeding Up Base-Level Testing

Once I knew that a certain patch worked properly for both single systems and clusters, I did not need to retest it on QA systems in both modes in the actual base level. Rather than proving the functionality was correct (already accomplished), I needed only to confirm that the correct patch was present in the base-level code as delivered. This usually meant I only needed to run one test case, and usually a much simpler and faster one. In this case, it meant I could avoid setting up a QA cluster (we never had enough clusters) and needed only a small subset of tests for a single system; if the correct code was present, I knew from the prebase-level testing that it would be fine, and I could cycle the test resources to the next tester much more rapidly.
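A patch-presence check of this kind can be very simple. The sketch below assumes, purely for illustration, that the build process records each merged patch ID in a manifest; the IDs and format are invented:

```python
def patch_present(build_manifest, patch_id):
    # Hypothetical: the build records one merged patch ID per line.
    # Presence of the ID confirms the already-proven code actually
    # shipped in this base level; no functional retest is needed.
    return patch_id in build_manifest.splitlines()

# Simulated manifest from a delivered base level (illustrative IDs).
manifest = "auth-patch-17\ncluster-fix-4\nlogin-patch-9\n"

assert patch_present(manifest, "auth-patch-17")      # proven patch shipped
assert not patch_present(manifest, "auth-patch-99")  # missing patch caught
print("presence check ok")
```

The functional proof happened during prebase-level testing; the base level itself only has to answer the cheap question "did that code make it in?"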

To be strictly rigorous, there is some exposure in this method. It is theoretically possible that some other patch from an unrelated group could affect some hardware permutation I skipped, letting a bug get through. This was controlled by making sure development's test systems were kept up-to-date with each new base level, and by doing a complete test run on certain critical base levels, such as the ones preceding the beta release or first customer ship. With this scheme, interaction bugs are picked up on the base level following their introduction (during the prebase-level testing for the next base level), except for critical base levels. This exposure is extremely low compared to the number of bugs we could catch with the additional prebase-level testing we could accomplish in their place. In practice, we had no bugs of this sort at all.

For most base levels, therefore, I could reduce my QA hardware needs by at least 75 percent (a three-node cluster and single system down to a single system) because I had already run the tests in both single system and cluster systems 10 to 20 times, including at least one complete run of all configurations on development's final test run of the patch before hand-off, and more often two. In short, I was getting 10 times as much testing on each base level, on machines that were essentially idle otherwise.

This system also meant I seldom found a bug on the QA side of things, and therefore did not have to report it through the expensive and painful bug-tracking system. Indeed, I didn't even need to loop through management. When I found a bug I simply walked over to the developer who wrote the code and ran the test for him or her. Without the overhead imposed by the cumbersome customer bug-tracking system, bugs could be fixed in minutes that used to take days; in fact, often enough I could get in another run of my tests proving the fix that same day. Being able to pinpoint a precise test case in an automated suite meant I had little trouble communicating the exact bug to the developers, and their familiarity with me and my test work meant they could use the test suites themselves for unit tests.

Conclusion

Bugs are cheap when caught young. You can catch bugs earlier by making QA a part of your development, not just part of the release process. This can save you calendar time because more work can proceed in parallel. This can also save you resources, because hardware and expertise can be shared that would otherwise be duplicated. It provides much more testing overall, reducing your exposure on critical bugs and making them cheaper to fix. And by tying QA to development, you make the statement that QA is "real" engineering, and you encourage people who are good at it to do it — and that's a big win for everyone.

There is a bit of a downside to this method, though. It evens out your personal workload so effectively that you become nearly immune to the usual crunches that can affect everyone around you doing it the old-fashioned way. This can make you look rather "under-utilized," as the managerial catchphrase so delicately puts it, at least to your own management in QA. It is wise at this point to schedule a status meeting with your own boss and the manager of the project you are working with in order to disabuse management of this notion. By working smarter, not harder, you can get far more done. But don't let it look too easy.

DDJ

