Community Voices

Dr. Dobb's Bloggers

Five Questions With Jim Bullock

June 23, 2008

Jim Bullock has been building software for twenty-five years. He has been writing articles and books which engage your mind for almost as long. And I am sure he has been an interesting person for at least as long.

I remember the first time I met Jim in person. I figured we'd talk for an hour or so and then I'd be on my way. Instead, we talked for over four hours. We stopped then only because we each had supper engagements!

One reason Jim and I talked from one meal clear through to the next is that both of us want to understand most everything. Jim has a knack for asking questions which uncover other questions which uncover other questions and so on until he uncovers the root of whatever matter is being discussed. I discovered this helps him be an excellent lunch companion. A plethora of companies have discovered it also helps him move them from floundering to fabulous.

Jim pays attention to who does what, when. He calls this Conscious Development. I believe he would join me in also calling it fun. Here is what Jim has to say:

DDJ: What was your first introduction to testing? What did that leave you thinking about the act and/or concept of testing?

JB: I learned to develop software supporting university research labs. Everything was novel; we needed it to work and suspected it wouldn't. So, I was first exposed to software testing with real questions, mostly about my own work, and a personal interest in the answers. That environment was also hugely influential for me - people creating experiments to learn something they wanted to know. We tested our software from the same POV. In testing software, I think what you want to know comes first. From that you invent experiments and the gear to support them. We ask software testers to do experiments on software for us because they are crafty and find interesting things.

The first independent software testers I worked with tested my team's heat pump control software at Carrier Corporation. It was both disconcerting and comforting having other people poking at our stuff. Most of the problems they found were cases where we'd correctly implemented the wrong thing. After that we got testers looking at plans, requirements, designs and so on. You can test anything, and it's worth it. Something happens in people's brains when they think in terms of interrogating the thing in hand. I also learned the value of independent eyes. With those systems, we developers were sure of what the system should do, and we were wrong. It turns out your own understanding blinds you.

Ever since that experience at Carrier, I try to get ignorant eyes on every work product. To start with, independent testing is a set of ignorant eyes not blinded by what you thought as you did the work. With code I organize development in a Lean / FDD hybrid with strong gating. Then, every integration gate includes a developer who didn't work on the change and a tester. As new eyes, those reviewers are really a test case, too, standing in for maintainers who come along later. A successful system is going to outlive the original team, so "somebody new" and "recovering history" are two use cases every system will experience. Why not test for them? (Actually the Ingres engineering VP talks about this in the WSJ business technology blog - first week of June, 2008.)

I organize testing the same way – Lean / FDD workflow with strong gating and lots of "ignorant eyes" along the way. With testing the extra eyes are especially helpful, since testing only gets action when someone else accepts the results. The extra eyes test your delivery as well as your results, and how it plays for someone else, not just for you.
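
(As a rough illustration of that "ignorant eyes" gate - not Jim's actual tooling - here is a minimal sketch in Python, assuming a hypothetical Change record with authors, reviewers, and testers. The rule it encodes: a change passes only if at least one reviewer and at least one tester did not work on it.)

    # Hypothetical sketch of the integration gate described above: every change
    # needs at least one reviewer who didn't work on it, plus a tester.
    # Names and structure are illustrative, not any team's real tooling.

    from dataclasses import dataclass, field

    @dataclass
    class Change:
        authors: set = field(default_factory=set)
        reviewers: set = field(default_factory=set)
        testers: set = field(default_factory=set)

    def passes_gate(change):
        """True only when the change gets fresh eyes: an independent reviewer and a tester."""
        independent_reviewers = change.reviewers - change.authors
        independent_testers = change.testers - change.authors
        return bool(independent_reviewers) and bool(independent_testers)

    change = Change(authors={"dev_a"}, reviewers={"dev_b"}, testers={"tester_c"})
    print(passes_gate(change))  # True: one new-eyes reviewer and one tester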

I was first responsible for independent testing while running the implementation of a 3-city, distributed IT system on a new product. I oversaw the vendor's development and testing as part of our product acceptance, and managed our functional testing and specialty tests like performance and data conversion. From this I learned that independent testing also flushes out problems in configuration management & build, requirements management, and development practices – really, in any kind of communication. Since then, I always make a "process error" defect category for things like these. I also learned that in evaluating your product, other people assess how you did the work – explicitly or on the sly. So, it is good to have a compelling, transparent story of what you did and why. Then test your story.

Since then, in PM and development roles (including life cycle tools development, which is fun), I co-created a testing consulting practice in the mid-1990s serving multi-tier enterprise IT development projects. After that I spun up several QA / test teams for both web site companies and products. Each company or project is a little different because different systems have different questions. "Correctness", sure, but correct in what sense? Under what circumstances? By the way, what are the project and company constraints on the testing we can do? Somewhere in there, people decided I was a tester. I'm not so sure. I mostly align the team and tools to the needs at hand. Sometimes I get to do testing myself, mostly around architectural risks or performance.

DDJ: What has most surprised you as you have learned about testing/in your experiences with testing?

JB: I'm surprised by people focusing on ritual vs. results. I've seen testers insist on doing stuff that finds nothing. I've seen developers insist that "You must test this way" when something else is finding lots of valuable results. I've seen testers and developers ignore impassioned questions about the software because their preferred techniques do something else. I've seen testers & developers ignore the performance that the project or organization needs from testing.

I'm baffled. If testing one way finds nothing, try something else. If testing another way finds things that matter, well, that result is all the justification you need for the technique. If there's something you need to know about the system you have, figure out how to discover it. I think testing starts with questions: "What do we want to do better?" and "What would we like to know more about?" followed by: "How do we find that out?" and "Is the testing we are doing helping?"

I sometimes find testers, developers, PMs & others griping about each other, too. "Those guys take too long." or "Those guys build stuff we can't test." Sometimes people act like those other guys can fail but somehow we'll still be OK. Maybe it's because I bounce between roles but I don't get that. It's not a lifeboat where the guy who falls out is in trouble while we're OK. It's a human chain where if anyone is in trouble we all are. We need these people. Measure your contribution in terms of how you create opportunities for others to contribute in turn.

When I'm PM-guy or running development, I tell testers: "Here's what I could use from you." Then I ask: "What do you need from me to be able to do your job?" and "What else can you offer?" The better they do their job, the easier my job, so why wouldn't I? When I'm doing testing I focus on the kind of information development, sponsors and Software Process Improvement (SPI) need about the system, and the kind of performance in doing testing that will help them succeed. Techniques follow from that.

DDJ: Is there something which is typically emphasized as important regarding testing that you think can be ignored or is unimportant?

JB: There's a lot of arguing from position that I don't get - "schools of testing", methods and movements (the current loud one is "Agile"), and individual techniques ("unit tests" and "model driven testing", for two). The pragmatic people I know go: "Interesting . . ." to anything that does some good, and "Well, duh" to the limitations. Professionals have a kind of informed open-mindedness.

The other day on his blog Keith Braithwaite quoted an old-school paper by D. L. Parnas. He points to an article by Michael Feathers that refers to something from a while back called "Clean Room Software Development." Clean Room Development is nearly the opposite of "Agile" in many of its choices, and it works, too. (Maybe if you turn the knobs down to 0 that's also interesting.) Testing in Clean Room Development works stochastically from the outside, exploring a formal model of the state space the system will encounter. So Feathers and Keith, both of whom would probably call themselves "Agilists", were looking for interesting ideas, observing what happened, and seeking to understand. Isn't that the testing mindset? Feathers concludes that techniques that work encourage reflecting on how your stuff is supposed to work. That's the right way to do it. Arguing from position is for amateurs.
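
(To make that concrete: Clean Room's statistical testing drives the system from a model of its state space. The sketch below is only a loose illustration of that idea, not the formal method - random walks over a made-up usage model generate test sequences to run against the system.)

    # Loose illustration of testing stochastically from a model of the state
    # space: random walks over a hypothetical usage model produce test
    # sequences. This is a sketch, not Clean Room's formal statistical method.

    import random

    # Hypothetical usage model: state -> [(next_state, probability), ...]
    USAGE_MODEL = {
        "logged_out": [("logged_in", 0.8), ("logged_out", 0.2)],
        "logged_in":  [("browsing", 0.6), ("logged_out", 0.4)],
        "browsing":   [("checkout", 0.3), ("browsing", 0.5), ("logged_out", 0.2)],
        "checkout":   [("logged_out", 1.0)],
    }

    def generate_test_sequence(start="logged_out", max_steps=10):
        """Walk the usage model and return the sequence of states visited."""
        state, path = start, [start]
        for _ in range(max_steps):
            next_states, weights = zip(*USAGE_MODEL[state])
            state = random.choices(next_states, weights=weights)[0]
            path.append(state)
        return path

    for _ in range(3):
        print(" -> ".join(generate_test_sequence()))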

Arguing from a position in testing is most virulent around techniques. For example, "All tests must be automated" is silly, while declining to automate tests you sensibly can is just as dumb. For all the good of the current emphasis on "developer testing" - and by the way, when was it ever OK for developers not to test their work? - some folks get narrow-minded about it. Is developer testing by itself sufficient? What about those ignorant eyes? Or more subtly, does xUnit give you a design technique called TDD, an automated regression suite making refactoring safer, or a harness for building interesting tests that operate at APIs? I pick "all of the above" and "do what makes sense", but you'll find people arguing from position that it is one or the other.
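
(A minimal sketch of "all of the above", using a hypothetical cart API: the same xUnit-style test class can drive design TDD-fashion, stand guard as a regression check, and exercise the system at an API boundary.)

    # Minimal xUnit-style sketch: one small test class that serves as a design
    # aid (written before or alongside the code), a regression check, and an
    # API-level harness. The Cart API here is invented for illustration.

    import unittest

    class Cart:
        """Toy API under test."""
        def __init__(self):
            self._items = []

        def add(self, name, price):
            if price < 0:
                raise ValueError("price must be non-negative")
            self._items.append((name, price))

        def total(self):
            return sum(price for _, price in self._items)

    class CartApiTest(unittest.TestCase):
        def test_total_sums_prices(self):
            cart = Cart()
            cart.add("book", 10.0)
            cart.add("pen", 2.5)
            self.assertEqual(cart.total(), 12.5)

        def test_rejects_negative_price(self):
            with self.assertRaises(ValueError):
                Cart().add("gift card", -5.0)

    if __name__ == "__main__":
        unittest.main()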

The thinking seems to get shallow when advocacy sets in. What's left seems like partial descriptions to me. Sensible testing does something like "V model" if you look at it that way, but isn't just that. Clean Room Development's stochastic testing is "black box" but not "black box" in the way many people mean. It's incredibly disciplined and thoughtful. I think a professional tester is able to pick among the different approaches and tools out there. I think a lot of people have a good and useful partial description of testing software. A great example of this is Cem Kaner's paper on 100 different kinds of test coverage. What, 99 of those folks were being silly? Or maybe each thinks their kind of coverage makes sense in their situation. Maybe it does. Now, you pick the kinds of coverage measure that make sense in your situation.

There's a better way:

  • The risks and priorities of your system and project point to questions you want to answer better.
  • The tools & methods that fit come from those questions.
  • Whatever you discover is valuable information.
  • Do the most useful thing you can, in the situation at hand.

DDJ: Going meta (to channel Jerry Weinberg), what else should I ask you?

JB: What doesn't get enough attention and should?

DDJ: What would you answer?

JB: SPI doesn't get enough attention, and the little it gets is wrong. If you're going to have testing or QA (they're different), have ongoing SPI too. Too many teams keep making the same old mistakes, then finding them in the same old way. Everybody is busy, but it's dysfunctional, literally codependent. Each group creates an artificial need for the other. What would testing do if development eliminated their usual ways of screwing up? What would development do if testing started finding new categories of problems? What is either group doing about the problems that got past the way they work already? If the other guys learn and change how they work, you'll have to change how you work too, and that's uncomfortable. With testing or QA you have data to drive that, if you are willing.

I saw this at a small company client where I was Interim QA & Test guy, then Interim Engineering VP. During a maintenance release addressing a dozen known, critical problems, testing found 60+ more, equally critical defects and bounced every attempted repair at least once. The regression rate of changes was steady over four months. After I moved to Engineering, it was three months of shipping both 50+ repairs / month and new development before testing found their first problem - same guys, same lab, same product. Five months and counting after finding that problem, testing had yet to find another. Meanwhile, via investigations to resolve failures in the field, engineering identified several new common kinds of faults in the system and set about removing them across the whole product - hundreds, still with no apparent regression.

The difference was data-driven SPI. First, in test, we used the defects we discovered to get more crafty about exercising the system. If configuration variations were failing, we'd try more configuration things until that stopped being fruitful. Same with techniques or approaches to testing. Meanwhile, engineering kept working the same way they had. Later, in engineering, we had four months of data telling us our most common kinds of mistakes - code management screw-ups, problems ill-understood before repair, and design-free changes. (That's in DeMarco's "The Deadline" and it fits with my experience. Once you straighten out the life cycle and tool chain at all, the vast majority of "bugs" are under-design.) So, we straightened out code management, then added "explain the problem" and "explain how your design works" to change gating, and our results got better.
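
(The data-driven part is mundane but powerful: tally defects by kind and let the biggest piles pick the next process change. A tiny sketch, with made-up categories and records:)

    # Sketch of data-driven SPI: count defect records by kind so the most
    # common mistakes point at the process change worth trying next.
    # The records and categories below are invented for illustration.

    from collections import Counter

    defects = [
        {"id": 101, "kind": "code management"},
        {"id": 102, "kind": "ill-understood problem"},
        {"id": 103, "kind": "design-free change"},
        {"id": 104, "kind": "code management"},
        {"id": 105, "kind": "design-free change"},
        {"id": 106, "kind": "design-free change"},
    ]

    by_kind = Counter(d["kind"] for d in defects)
    for kind, count in by_kind.most_common():
        print(f"{kind}: {count}")
    # Here "design-free change" tops the list, which is the sort of signal
    # that argues for adding "explain how your design works" to change gating.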

By the way, I don't believe we had no regression over those eight-plus months. I believe we didn't notice any in engineering and didn't find any in testing. That made me nervous. Somebody should have been thinking of other ways to poke at the system, to find other stuff. In test / QA the team and I had identified 10 kinds of testing we could throw at that system - each "kind" a type of fault and a technique to get at them. We'd executed on one, expanding the coverage over 10x. We'd started applying two others when I transitioned to Engineering, and that's where it stopped. Testing wasn't finding much new because they weren't trying anything new, while Engineering was getting smarter about how to work.

For SPI to work we have to get away from thinking that "SPI = progress toward the one true way." I see too much process-ritual in developing & testing software - stuff that Must Be Done(tm). Better to look at what's working and not working for us, here and now. What does our experience tell us? The way we do testing or development is also a test – an experiment in building software a particular way. If what we try doesn't work, try something else. We also need everyone involved in SPI to look first at their performance, then at the performance of development as a whole. Too often "SPI" means "scolding other people." Too often "SPI" means "sub-optimize the whole to optimize my part for me." I've seen testers and developers do that, and it's silly.

In the example I gave we mobilized resources to let people work better. Testing got permission to go after additional questions and ways to poke at the system, as long as they were fruitful. In engineering, "strong gating" really gave engineers management cover to take the time to think through what they were doing. Effective SPI is really about noticing what you could do better, then allocating some resources to doing that. There is no scolding in effective SPI.

DDJ: Is there anything else you would like to say?

JB: Personally, I emphasize the human interactions around testing and the value testing delivers in the organization. I may have a selection bias, but in my experience effective testing and software development depends on interactions first, processes & practices second and specific techniques last. Fix the former and the latter correct themselves.

For example, my "Big Book of Perfect Software Testing" - a lightning talk and one page of notes - talks about the value testing delivers. Twice that I know of, it's been included verbatim in other people's training, which is quite cool. My "Calculating The Value of Testing" in STQE (now Better Software) talks about the value of testing as a business function, from the POV of executives as sponsors.

On human interactions, Brian Branagan and I presented "The Three r's of Software Testing" about ways for testers to keep their heads in the game. Stresses in a project or organization come out in testing, in part because testing done right injects reality. "The Three r's" is about working with that. There's an article that goes with the presentation that I probably should publish. Or look at "Chinese Contracts" on the AYE Conference site. That's about agreements between people working together, like testers, developers and PMs.

Oh, yeah - hire me. I say I do "conscious development." It's amazing what happens when you pay attention to what you are doing when you develop software. If you like what you are hearing contact me - contract or direct / permanent for the right gig in Seattle.

 

[See my Table Of Contents post for more details about this interview series.]
