All you want is to grab a quick gallon of milk and some bread on your way home from work: just get in and get out. Then you find yourself eighth in line at the register behind four people who decided to do their entire week's shopping that very hour. Groan.
Even as our appetite for convenience and immediate gratification grows, new hope is on the horizon for grocery shoppers everywhere. "Self-checkout" may not be in your local grocery yet, but it's definitely in your future. How do I know? Not long ago, I took part in a project that used an agile development approach to create a 24-by-7 retail point-of-sale (POS) product built in Java on a distributed, event-driven architecture. The developers weren't giants (but they were very good), the facilities and labs were far from perfect, and we hit major obstacles at every point from firmware configuration to integration with legacy systems. But despite the deep potholes and the tire tracks up and down our backs, it's a success story about how two innovative companies took a chance on a different way of developing some very complex software.
Software development process has been a hot topic for decades. But in the last few years, this debate has taken an abrupt turn. Some new kids on the block (agile or lightweight processes) are tearing up the turf of the resident heavyweights: the traditional waterfall approach (first proposed by W. W. Royce in a presentation at the Western Electronic Show and Convention in August 1970 and popularized by Barry Boehm in "Software Engineering," published in IEEE Transactions on Computers, Vol. C-25, No. 12, in December 1976) and the Rational Unified Process (RUP, introduced to provide a process framework for UML-based products in the 1990s).
The agile techniques come in various flavors: Extreme Programming (XP), SCRUM, Adaptive Software Development, Crystal, and, of course, reckless hacking (see "Put Your Process on a Diet" by Martin Fowler, Dec. 2000). Of this gang, XP has had the most provocative press, and is the de facto leader of the pack. XP is devoted to the overarching principle: Simplify wherever you can. This trickles down to a set of specific practices, among them continual refactoring, collective ownership of code, 40-hour weeks, simple design and small releases.
According to the XP gurus, each of these practices on its own may have little benefit for a project. All of them must be adopted to gain maximal, or even observable, benefit. And there's the rub. For many organizations, adopting all of the XP practices at once is a step into the deep unknown, not unlike the step of blind faith Indiana Jones took in The Last Crusade (which happily landed on a camouflaged foot-bridge). All companies want the benefits of faster development, but not all are willing to let go of a securely tied safety line. This is where RUP comes in.
RUP is a large process, but it was designed to be tailorable through a technique that Rational called the development case. It can be a ponderous, suffocating beast, or it can be trim and fit. The problem is how to construct a development case that clearly identifies the essential steps to perform and the artifacts (models and documents) to develop along the way. In this area, RUP is its own worst publicist. Out-of-the-box RUP doesn't do a very good job of indicating what's essential, desirable or superfluous.
Structurally, RUP is similar to the waterfall approach: Do requirements first; then do analysis, design, implementation and testing (the RADIT cycle); finally, release the product. But, whereas the waterfall approach does each of these once in a project lifetime, in RUP we iterate within the project lifetime. This means we take a 12-month project and divide it into 13 four-week iterations, or mini-projects with their own specific goals. We do the RADIT cycle over four weeks and deliver an incremental release: a working system that performs a defined subset of the total system functionality. Using this iterative, incremental approach, RUP allows us to "eat the elephant one bite at a time." Defining the bite-size is the real challenge. The grocery system discussed here was developed using a bare minimum of RUP, hence "Extreme RUP" (XRUP), with very small bites.
Moving Toward the Extreme
The QUICKcheck brand self-checkout project started with the usual goals: build it fast, build it right, and, by the way, build it fast! But this wasn't a garden-variety Internet application: It was a combined and parallel software and hardware development project with nary a browser to be found. Worse, it was a POS (point of sale) system for grocery stores that allows shoppers to check out and pay for their own groceries without a store cashier: a real shopping cart application! The system consists of four Customer Work Stations (CWS) and one Attendant Work Station (AWS), at which a store employee monitors and intervenes in the checkout sessions on the CWSs. Each CWS and the AWS can also be operated as a regular, staffed checkout lane to meet peak-demand needs.
When I began this project, all of this sounded pretty straightforward to me (simple, even, like an ATM). I was wrong. When I first examined an architectural diagram I had built for our system, I realized that the point-of-sale world was far more complicated than I had anticipated. In a normal grocery store, a human cashier does your checkout and makes sure you don't walk out with a roast hidden in your overcoat. This real cashier takes your money and gives you change and won't let 15-year-olds buy beer. All we had to do was build a system that did all these things with software, because there would be no cashier. And to accomplish this, we had to know about all 100,000 items in some grocery stores, including their physical weight. We had to interface to the back-office POS system, which allowed more than 250 individual configuration parameters to be modified in any store, and we had to perfectly match each back-office configuration. We had to enforce different rules to prevent selling cigarettes to minors, and to prevent selling liquor on certain days or during certain hours. We had to detect possible fraud and theft when a customer scans a five-pound bag of sugar and drops a five-pound roast into the grocery bag. We had to handle seven different pricing methods (by weight, quantity, group and so on) and eight different payment methods, and allow for a shopper to pay with one, two or more methods at a time, and in multiple currencies. We had to attempt unattended recovery for every exception, and allow a remote human attendant to monitor and even redirect every checkout transaction in real-time. And these were the easy constraints. As I sat there, the words "rocket science" came to mind. Actually, I've done rocket science, and I think POS systems are more complex!
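To make one of these constraints concrete, here is a minimal Java sketch of an age and time-of-day sale restriction. It is an illustration only, not the QUICKcheck code: the class name SaleRestriction, its fields, and the sample thresholds (minimum age 21, no Sunday sales, an 8:00 to 22:00 window) are all assumptions invented for the example.

```java
import java.time.DayOfWeek;
import java.time.LocalDate;
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.Period;
import java.util.Set;

// Hypothetical rule object: names and thresholds are illustrative only.
final class SaleRestriction {
    private final int minimumAge;
    private final Set<DayOfWeek> blockedDays;
    private final LocalTime earliestSale;
    private final LocalTime latestSale;

    SaleRestriction(int minimumAge, Set<DayOfWeek> blockedDays,
                    LocalTime earliestSale, LocalTime latestSale) {
        this.minimumAge = minimumAge;
        this.blockedDays = Set.copyOf(blockedDays);
        this.earliestSale = earliestSale;
        this.latestSale = latestSale;
    }

    // True only if the shopper is old enough and the sale falls on a
    // permitted day, inside the permitted hours (both bounds inclusive).
    boolean allows(LocalDate birthDate, LocalDateTime now) {
        int age = Period.between(birthDate, now.toLocalDate()).getYears();
        if (age < minimumAge) return false;
        if (blockedDays.contains(now.getDayOfWeek())) return false;
        LocalTime time = now.toLocalTime();
        return !time.isBefore(earliestSale) && !time.isAfter(latestSale);
    }

    public static void main(String[] args) {
        SaleRestriction beer = new SaleRestriction(
                21, Set.of(DayOfWeek.SUNDAY), LocalTime.of(8, 0), LocalTime.of(22, 0));
        LocalDateTime mondayMorning = LocalDateTime.of(2001, 3, 5, 10, 0);
        // A 15-year-old shopper, then an adult, at the same time of day.
        System.out.println(beer.allows(LocalDate.of(1985, 6, 15), mondayMorning));
        System.out.println(beer.allows(LocalDate.of(1970, 1, 1), mondayMorning));
    }
}
```

In a real store these values would presumably come from the back-office configuration rather than from constructor arguments, which is part of why matching each store's configuration mattered so much.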
When I joined the effort, Portland, Oregon-based PSC Inc. (www.pscnet.com), the system vendor, had selected Greenville, South Carolina-based Kyrus Corp. (www.kyrus.com) to develop the software. PSC, manufacturers of the ubiquitous Magellan scanner/scale device for supermarket checkout lanes, had four years of experience with an earlier commercial self-checkout system, but they wanted to produce a much-improved second-generation product. With origins in the cash register business (the company was founded in eastern Tennessee in 1974), Kyrus is a leader in POS systems and is the largest reseller in the U.S. of IBM POS systems. These two enterprises were a perfect match for the project.
I was engaged on this project through IconMedialab (www.icon-stlouis.com) as the object-oriented programming mentor and software architect. I hadn't done a POS system before, but my role was to train and guide the software team on OO techniques. The software vendor had selected Java as the development platform and wanted to follow an iterative approach, so we adopted RUP as the guiding framework for the project. But we slimmed down RUP according to a mini-RUP process (see www.evanetics.com/Articles/Project/IterativeProcessOverview.pdf) that I had defined on an earlier project for Kyrus. My tongue-in-cheek name for this is "Extreme RUP" because it fits on one page, focuses on simplicity, does only what is essential, and moves fast without sacrificing the practices of modeling, design or explicit architecture exploration.
At the start of the project, the software team consisted of two programmers who were very experienced in grocery point-of-sale, but didn't know Java and had almost no OO background. They added two Java contractors who knew nothing of POS systems and had learned OO techniques on the job. As the fifth person on the team, I brought my OO background (I've worked exclusively in object technology since 1993), enough Java skill to be dangerous and virtually no knowledge of POS systems. As a group, we knew a little about everything, but technical expertise alone isn't enough.
"Hiring a dedicated engineering team manager who focused on this product development only" was the smartest thing the software group did, says Paul Denimarck, director of Self-Checkout Systems at PSC Inc. This manager was experienced in developing self-checkout systems and acted as domain expert to fill in the business details that were missingfor example, how do shoppers react to audio direction? Or, what is our best response to a five-pound bag of sugar that is scanned but never appears in the grocery bag? Since the QUICKcheck system detects possible theft or fraud by weighing each item you purchase, this expert helped solve our dilemma concerning the differentation between the labeled weights of individual bottles of cola or shampoo and their actual weights. Before we added the project development manager to the team, the programmers were making it up as we went along. Imagine five (male) developers role-playing how the system should react to a 22-year old woman with three screaming children whose box of pasta won't scan. The number of possible scenarios was staggering.
As with an XP project, we kept the number of developers small. Two of PSC Inc.'s programmers developed all the software for driving all the external devices such as the bill acceptor, produce scale and weight-security scale. A team that varied between four and seven developers implemented the core application, which topped out at around 93,000 lines of Java code.
Making the Business Case
When I joined the project, the partner companies had already agreed on the business case, an initial budget and a schedule. In theory, they had fulfilled the goals of the RUP inception phase, but the team and I saw only unanswered technical questions. So we pushed the project back into the inception phase, to what I call "iteration zero," and spent four intense weeks trying to identify those questions that needed resolution before we could begin technical iterations on the product. During iteration zero, we wrote the use cases that captured the major operational personality of the system, and which contained the highest risk elements or most architecturally significant behavior.
For example, our initial use cases had to specify how we would handle bar-coded, non-bar-coded and bulk-purchase items, each priced by weight or by quantity, in addition to accepting eight different methods of payment. Our number-one risk was transferring events between each self-checkout workstation and the monitoring workstation for display to the store attendant. Java JMS was more than we needed, so one of the Java contractors wrote from scratch a publish-subscribe manager that used Java's Remote Method Invocation. Our number-two risk was implementing the price lookup and receipt printing behavior, which required our interfacing to the back-end POS system (more on this below).
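For readers unfamiliar with the publish-subscribe pattern, the following Java sketch shows the general shape of such a manager. It is not the contractor's code: the names CheckoutEventListener and EventBroker are hypothetical, and the sketch runs inside a single JVM. The listener interface extends java.rmi.Remote and declares RemoteException only to suggest how subscribers on another workstation could be reached over RMI.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical subscriber interface, shaped like an RMI remote interface.
interface CheckoutEventListener extends Remote {
    void onEvent(String topic, String payload) throws RemoteException;
}

// Hypothetical in-process broker: topic name -> registered listeners.
final class EventBroker {
    private final Map<String, List<CheckoutEventListener>> subscribers =
            new ConcurrentHashMap<>();

    void subscribe(String topic, CheckoutEventListener listener) {
        subscribers.computeIfAbsent(topic, t -> new CopyOnWriteArrayList<>())
                   .add(listener);
    }

    // Delivers the event to every listener on the topic and returns how
    // many deliveries succeeded. A listener that fails is unsubscribed.
    int publish(String topic, String payload) {
        List<CheckoutEventListener> listeners =
                subscribers.getOrDefault(topic, List.of());
        int delivered = 0;
        for (CheckoutEventListener listener : listeners) {
            try {
                listener.onEvent(topic, payload);
                delivered++;
            } catch (RemoteException unreachable) {
                listeners.remove(listener); // safe: iteration uses a snapshot
            }
        }
        return delivered;
    }

    public static void main(String[] args) {
        EventBroker broker = new EventBroker();
        // The attendant station subscribes to events from one checkout lane.
        broker.subscribe("cws-1", (topic, payload) ->
                System.out.println("AWS sees " + topic + ": " + payload));
        broker.publish("cws-1", "ITEM_SCANNED");
    }
}
```

In an RMI deployment the listener objects would be remote stubs exported by the attendant workstation, and a failed call would signal a lost connection rather than a local error.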
At the end of these four weeks, the team had produced a very brief architecture description and 17 pages of issues and questions. The marketing and field engineering groups on the project staggered when they saw this, but our team felt pretty good. We had explored the entire geography of the system and achieved the "I Know Enough" milestone, so we charged ahead even with most of these business and technical questions unanswered. Groups accustomed to Big Requirements Up Front cringe at this apparently reckless attitude. But it's not reckless at all; rather, it's the defining characteristic of a controlled agile process. Only by moving to the next activities would we understand what we really needed and what we had missed. It was time to elaborate, and over the following weeks, we eventually got the answers we needed to the really important questions.
One area of spirited dispute in the XP philosophy is the role of documentation, especially when it comes to models as a form of documentation. XP declares that the actual system code is the only trustworthy documentation. There is some merit in this idea. There's also ample room for dissent. Some people can read program listings and immediately discern design and system goals. Most can't, however; therefore, source code may not be the best form for them. Many people find it easier to grasp abstract ideas through visual models such as UML diagrams.
Models can fill different roles in a software project. Sometimes we model to understand, using visual techniques to explore new concepts. At other times we model to communicate, to capture in a visual form stable concepts that we want others to appropriate. (Thanks to Scott Ambler for popularizing this distinction.) Since our team had patchy modeling expertise, we used models to communicate with each other, but more extensively to understand the system we were to build. It paid off.
We tried to establish the large common area advocated by XP for open contact and communication. Unfortunately, bureaucracy won out and the team members were assigned cubes instead, so we commandeered a large conference room lined with whiteboards. This became our "war room," and it was here that we did our modeling as a team activity.
Everyone on the software team was involved in the modeling. While XP avoids modeling as a discipline, RUP is, not surprisingly, given its origins, model-intensive. We modeled, but only as necessary, and early models on our project were developed also to ingrain object-oriented design concepts into the team members. Russ Pridemore, one of the Java contractors on the project, became a strong modeling advocate. "Our modeling activities helped us understand the problem domain and how to attack it," he says. "By starting simply in early iterations, we were able to prove that our approach was workable before we were too entangled in details." When a team is not deeply experienced in visually expressing abstract concepts, they learn only by doing the models, just like learning integral calculus.
All of our models were developed on whiteboards, erased, redeveloped, modified, thrown out and resurrected as needed. In the early iterations, we regularly captured models in an OO CASE tool to ensure that we had a checkpoint for our discussions. We admittedly used these models as a safety net. Later, we were less retentive and captured models or changes only after they had stabilized. In the latter half of the project (RUP's construction phase), it was the code that we cared most about.
All of our modeling centered around a very small core set of UML diagrams: class diagrams, sequence diagrams for the important scenarios and a few statechart diagrams. (These last, by the way, were for the weight security classes, the absolute most proprietary and well-guarded heart of this system.) A typical three-week iteration consisted of five days (if needed) of requirements capture, plus analysis and design modeling for the features in that iteration. Then, the team spent the remaining two weeks coding and testing what we had modeled. At the end of each iteration, we did a one-hour review, decided what to change and launched into the next iteration. Each iteration delivered a working executable. "We would never have been successful if we had not done the models early in the project," says Dale Hughes, the development manager on the team, "and they were especially important for the coding activities."
A question that always comes up is, "What did you do when the code no longer matched the models?" Answer: usually nothing. Here is where XP minimalism paid off for us: If the code was close to the current models, we didn't waste effort bringing them into perfect alignment. The models are not the system; they're a means of understanding the overall approach. If a new member joins the team, the vision statement and the architecture specification are enough to learn the "big picture." After that, reading the code is the best way to learn the system.
Based on project scope and our staffing level, we developed our project plan. This was a "real" project plan (not a Microsoft Project printout) but still only 16 pages long for a nine-month project. (The template is online at www.evanetics.com/Articles/articles.htm.) Our original plan decomposed the feature set of the system into 13 iterations of three and four weeks each. Each iteration was "time-boxed": we agreed to stop at the allotted time even if all the features were not completed. This approach of small, well-defined and controlled iterations is central to any iterative, incremental process, including both XP and RUP. Eventually, the plan was reorganized into a total of 14 iterations, some of which lasted up to nine weeks. These latter iterations were extended to ensure that each feature set was implemented completely and tested fully.
We selected the content of each iteration based on the known dependencies among the features, and assigned scenarios from the system use cases to each iteration. Detailing the content of each iteration followed a simple recipe: While conducting iteration N, write the Iteration Plan for iteration N+1. This allowed us to focus on those tasks immediately confronting us. No matter how hard we might try to describe detail for three months down the road, we'll never get it even mostly right. But when we're actually laying down code today, our perspective of what we can do in a week or two is measurably more accurate.
Software integration is a challenging issue on any non-trivial project. XP advocates continual (sometimes hourly) integration whenever any change is made and tested. RUP is more formal, but defines integration via multiple builds within an iteration, with each build being integration-tested. Again, we took a moderate approach. One of the team members did the integration of the iteration contents, usually every couple of days. Each build was then verified and turned over to QA for system testing. All of this occurred within an iteration, and eventually the QA testing coincided with the beginning of the next development iteration.
Iteration one developed a basic messaging system that managed the core of the CWS and AWS. Iteration two developed a minimal interface for the AWS to monitor the events coming from the CWS. Iteration three physically connected a CWS and an AWS and passed live messages between the two. Each stage culminated in the delivery of a working executable (in the form of a JAR file, which is a convenient way of combining class files with their associated GIF images or other resources) to PSC, our internal customer. The testing that PSC did on each delivered iteration was invaluable. Having your customer install, test and break your software is humbling. It's also painful: Billings were tied to iterations, and if our software didn't work, Kyrus didn't get paid! Early problems involved silly JAR file installation issues and DLL mismatches. Later, we encountered threading defects while simultaneously driving peripheral devices such as the bill acceptor and coin dispenser, as well as problems configuring our workstations appropriately for the POS environment configured in the PSC labs. Those of us who were new to POS were stunned at the configuration and customization problems that could arise in even a simple POS environment.
Integration with legacy software was another challenge. In iteration four, we had to go "live" against Supermarket Application (SA), the IBM back-office point-of-sale system. SA knows everything about what is being sold in a grocery store, and we needed to request services from it. However, SA has no application programming interface: It talks only to devices, like the 50-key POS keyboard a cashier uses, or the 2-row-by-20-character display on a checkout lane. Our system had to send directives to SA by emulating keystrokes from a POS keyboard. We then had to determine SA's state and responses to our directives by eavesdropping on the text strings sent by SA to a 2-by-20 "lollipop" display, or receipt printer.
We estimated that iteration four would take a total of three weeks. At week two, however, we were floundering. The back-end POS system wasn't responding as the documentation indicated. We discovered that the overall request/response protocol was not synchronous: the responses did not always follow the order of the requests. The team agreed that we needed more time to integrate with SA. We jettisoned the time-box (if we didn't get the interface working, we wouldn't have a system at all) and announced a three-week extension of this iteration.
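One common way to cope with responses that arrive out of order is to track outstanding requests by a key instead of assuming first-in, first-out. The Java sketch below illustrates that general technique only. It is not a description of how the team solved the SA problem, and it assumes, hypothetically, that each response carries something (here a plain string key such as an item code) that identifies the request it answers. The class name PendingRequests is invented.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical tracker for requests that may be answered in any order.
final class PendingRequests {
    private final Map<String, String> requestByKey = new ConcurrentHashMap<>();

    // Record a request at the moment it is sent.
    void sent(String key, String request) {
        requestByKey.put(key, request);
    }

    // Match a response to its request by key, not by arrival order.
    // Returns empty if nothing is outstanding under that key.
    Optional<String> received(String key) {
        return Optional.ofNullable(requestByKey.remove(key));
    }

    int outstanding() {
        return requestByKey.size();
    }

    public static void main(String[] args) {
        PendingRequests pending = new PendingRequests();
        pending.sent("item-4011", "PRICE_LOOKUP bananas");
        pending.sent("item-7350", "PRICE_LOOKUP cola");
        // The second request happens to be answered first.
        System.out.println(pending.received("item-7350").orElse("unmatched"));
        System.out.println(pending.received("item-4011").orElse("unmatched"));
    }
}
```

Where a response carries no such key, as with text intercepted from a display, the matching logic would have to infer the association from the content itself, which is considerably harder.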
The next day, the vice president of development walked into the war room. "Gary, I hear you're slipping your schedule. What's going on?" I explained the technical obstacle and why we believed we could overcome it in another three weeks. He was clearly worried, and said, "If you're slipping this early in the project, then it's going to be even worse near the end." I understood his concern: He was looking at this slip from a waterfall process perspective. "No," I explained, "We're slipping now so we won't slip later. We're attacking the highest-risk elements now so that we'll be picking only low-hanging fruit later." This didn't make him any more comfortable, but, to his great credit, he left us alone. The team focused exclusively on interfacing with SA, and we completed iteration four one week prior to the extension deadline. The point of this example is to illustrate the obvious: If something goes wrong, you do what is necessary to fix it. We had chosen an agile process, which allowed us to make the necessary changes as easily as possible.
This was not the only obstacle that we had to overcome. And, yes, we missed our original schedule target by several months, and this incurred a sizeable budget overrun. Focus is crucial, and so is a realistic perspective. Despite our best efforts, our early understanding of the system's requirements in April 2000 turned out to be rather naive eight months later in December 2000. "As we built the system in small pieces, we all saw that we had underestimated the depth of effort required to build the total system," says Don Bellis, PSC's technical interface to the software group. So we adjusted in bite-sized feature changes.
By the time you read this article, QUICKcheck will be in a sane, controlled deployment in a major grocery chain. It was built in small steps that resulted in a major product achievement. Albert Einstein once said, "Make your thinking as simple as possible, but not any simpler." Every software process can be lightened, but not every process needs to be stripped to its absolute minimum. There is more to software development than just writing code, and the process must meet multiple, competing goals. This article is an example of how a relatively heavyweight process, RUP, was adapted toward the philosophy of XP without adopting XP as a whole.
Will I do the next project differently? You bet. XP guru Ron Jeffries made this haunting statement on an Internet discussion group: "If you eliminated one-half of your process, would you miss it?" When I read that, I immediately saw that today I do less than one-half of what I did just three years ago, and I don't miss it. Quality is higher. Teams are happier. Changes are easier to accommodate. On my current project, I'm doing fewer formal diagrams and running even shorter iterations. Maybe I really haven't discovered "Extreme RUP" yet!