Last Tab
Space missions, unlike software projects, can't use iterative design. You simply don't release an incomplete Version 0.1, get user feedback, tweak the design, release Version 0.2, and continue iterating until the code works well enough. Instead, you spend a few hundred million dollars over a decade or so, and get exactly one attempt. Should you overlook something, you must start all over.
Complacency at any point can kill such a project and, I think, accounts for much of ordinary software's error rate. If we all did what we know we should, most of the errors simply wouldn't happen. Be honest: Have you actually analyzed all the possible failure modes for that fancy SOA project?
Drawing up a simple fault tree for your current project will prove enlightening, if only because it shows how much code you cannot control. Even better, it might reveal ways to improve your own error handling and prevent hostile intrusions.
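For a sense of what a simple fault tree looks like in code, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the class names, the example events, and especially the probabilities, which are invented. It also assumes the basic events are independent, which real failures rarely are.

```python
from dataclasses import dataclass
from math import prod


@dataclass
class Basic:
    """A leaf event. `p` is an assumed failure probability, not a measured one."""
    name: str
    p: float
    external: bool = False  # True if the failure lives in code you don't control

    def probability(self) -> float:
        return self.p


@dataclass
class Gate:
    """An AND or OR gate over child events, assuming the children are independent."""
    name: str
    kind: str  # "AND" or "OR"
    inputs: list

    def probability(self) -> float:
        ps = [child.probability() for child in self.inputs]
        if self.kind == "AND":
            # Every input must fail for the gate to fail.
            return prod(ps)
        if self.kind == "OR":
            # The gate fails unless every input survives.
            return 1 - prod(1 - p for p in ps)
        raise ValueError(f"unknown gate kind: {self.kind}")


def external_events(node) -> list:
    """Collect the names of leaf events flagged as outside your control."""
    if isinstance(node, Basic):
        return [node.name] if node.external else []
    names = []
    for child in node.inputs:
        names.extend(external_events(child))
    return names


# Hypothetical tree: the request fails if a third-party service is down,
# or if bad input arrives AND there is no fallback handler to catch it.
top = Gate("request fails", "OR", [
    Basic("third-party service down", 0.02, external=True),
    Gate("unhandled bad input", "AND", [
        Basic("bad input slips past validation", 0.10),
        Basic("no fallback handler", 0.50),
    ]),
])

print(f"P(top event) = {top.probability():.3f}")
print("Outside your control:", external_events(top))
```

The numbers matter less than the shape: listing the leaves flagged as external shows at a glance which failures you can only guard against, not fix.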
The more procedures you put in place to prevent errors, however, the more opportunity you'll have to simply ignore those rules in order to get the job done. Has anyone ever factored that into a management decision?
Slides from a NASA Safety Directors Meeting are at http://ohp.ksc.nasa.gov/conference_info/2006/shconf/pdf/02-01-2006_ChandlerFaith_web.pdf. The Mars Polar Lander report is at www.jpl.nasa.gov/marsreports/mpl_report.pdf and the N-Prime MIB Report is at www.nasa.gov/pdf/65776main_noaa_np_mishap.pdf.
More on NOAA satellites at http://noaasis.noaa.gov/NOAASIS/ml/genlsatl.html, with a satellite user guide (!) at www2.ncdc.noaa.gov/docs/klm/index.htm.
No combination of keywords turns up the NASA newsletter containing the N-Prime director's summary of the risk-assessment meeting and that fateful phone call, but it's out there somewhere. Memo to self: Always save the links.