
4 Ways to Make the Ship/No-Ship Decision

January 13, 2011

At some point, you have to declare the software development project “done” and move on to the next one. But the decision-making process is different in many environments. Here’s how to make the decision… for better or worse.

I fondly remember my graduate course in software development management at Grand Valley State University. Good old CS 651 made the ship decision seem so easy. Why, you just shipped the software on the day the test phase ended, after all, and the testing phase ended on the day the schedule said it would.

Looking back, I am most amazed at the advice the CompSci textbook gave me. Oh, not at the book’s naiveté or its academic approach; that was to be expected. I was more amazed that a 600-page book about the software development process never addressed such a critical question.

To be fair, we don't do that much better at the “ship/no ship” decision in industry. The testing literature suggests that, outside of certain types of regulated projects, the tester’s role is to inform, not to decide, and the ship decision is best left up to general management. The Agile literature tends to either advise a "whole team" approach or suggest the decision is up to the product manager, the "single wringable neck" responsible for the tough choices.

Yet how, exactly, can a product manager make up his mind?

In the last decade, I've seen a few ways to make the call. Here I present four of them, along with some of the tradeoffs involved with each technique. (And, unlike that textbook, I do it without the big words or fancy symbolic notation, thank you very much.)

Here are my four favorite ways to make the ship/no-ship decision, from worst to best:

1. The Magic Deadline, or “The Big Boss Said So”

Sometimes the deadline is forced upon you: a trade show, the Christmas holidays, a contractual agreement. In this case, I prefer to structure the project incrementally, building the most important features first, so that developers, testers, and stakeholders can compromise on scope instead of quality.

Occasionally, compromising on quality can work. You might only show a demo at the trade show, and not actually sell the finished product. Or you may have an accepting user community. For example, Twitter users are famous for putting up with outages.

On the other hand, Twitter is free. Is your software free? Are people relying on the software to get their work done, not just to keep in touch with friends?

2. No “Show Stopper” Issues

There’s a reason they call them “blocking” defects – their presence means you don’t ship. So one way to manage the project is to proclaim, “We don’t ship until all the blockers are resolved.”

The “show stopper” method is about defining the earliest moment you can responsibly ship software. The most attractive benefit is that it’s also the cheapest.

Speaking of responsible, the “no show stoppers” ship methodology does not mean you can bang out code and not test at all. The “no show stopper” moment needs to be at the end of some sort of release-testing process, which I cover later. And yes, it is always possible that the latest Priority-1 Bug Fix introduced a new defect, so you may want to do a little more testing.

3. No “Show Stoppers” after a regression-test run, or a waiting period

Earlier I mentioned a regression-test run, which I might call a “cadence.” This is a responsible plan for testing; some shops might call it a test cycle. The idea is that when there are no known “blocker” bugs, a test cycle completes, and you still have no “blockers,” well, then, now you can ship.

Likewise, if you have some beta test process in which the software is staged and used (provisionally) by actual users, you could ship when some period of time has passed without anyone discovering a new “blocker” bug. This length of time is generally related to the development cycle. For example, Microsoft might have a six-month beta for a new operating system, but if your team ships a new build every month, your code might only be on staging for a week or two.
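For readers who think in code, the waiting-period rule can be sketched roughly as follows. This is only an illustration under my own assumptions: the `ReleaseState` fields, the `can_ship` function, and the `quiet_days` parameter are hypothetical names, not part of any real tool, and your team would pick its own quiet period.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class ReleaseState:
    """Hypothetical snapshot of a build sitting on staging or in beta."""
    open_blockers: int                   # known, unresolved "blocker" bugs
    last_blocker_found: Optional[date]   # date of the latest blocker report, if any
    staged_on: date                      # date the build went to staging or beta


def can_ship(state: ReleaseState, today: date, quiet_days: int) -> bool:
    """Ship only if no blockers are open and none were found for quiet_days."""
    if state.open_blockers > 0:
        return False
    # The quiet period starts at staging and restarts whenever a new
    # blocker is reported after that.
    clock_start = state.staged_on
    if state.last_blocker_found and state.last_blocker_found > clock_start:
        clock_start = state.last_blocker_found
    return today - clock_start >= timedelta(days=quiet_days)
```

A team that ships monthly might call this with `quiet_days=7`; a six-month beta would use a much larger number. The point is simply that a newly discovered blocker resets the clock.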

Each of these methods means less risk (and generally, more quality) for the end customer – and they cost you an increasingly large amount of time and money. The basic tradeoff is the amount of risk to the company versus the cost of shipping late. That cost could mean lost third-quarter sales, a broken contract, or opportunity cost. After all, if the team ships one product today, they can get started on another product tomorrow.

Most of the teams I work with try to do “No show-stoppers after a regression run,” but if a bug fix is trivial or low risk they may do a very light final regression run.

4. Consensus

Some software teams I’ve worked with have staff with advanced degrees and decades of software experience. Those developers and testers knew the customer extremely well, or actually were the target audience. It seems strange that we would lock ourselves into one of the policies above when the whole team just “doesn’t quite feel right” about the release, doesn’t it?

If your organization is profitable, doesn’t have a revenue gun in its back to ship on a deadline, and the technical staff has incentives that match the long-term goals of the business, the consensus method might just be the best choice for you.

If the team has the right level of confidence (and especially if it has a financial stake in the long-term outcome), you might just take a vote. If the risk to the company is high, you can require more than a simple majority; before you agree to ship the software, you could require two-thirds or all-hands to agree. Knowing that they get to make the decision can be a huge thing to help a team gel and take ownership.
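If it helps to see the voting thresholds spelled out, here is a minimal sketch. The function name `team_votes_to_ship` and the rule labels are my own illustrative choices, and I am assuming that a simple majority means strictly more than half while two-thirds means at least two-thirds.

```python
from typing import Sequence


def team_votes_to_ship(votes: Sequence[bool], rule: str = "majority") -> bool:
    """Decide a ship vote. Each entry in votes is True for 'ship'."""
    total = len(votes)
    yes = sum(votes)
    if total == 0:
        return False  # nobody voted, so nobody agreed to ship
    if rule == "majority":
        return 2 * yes > total        # strictly more than half
    if rule == "two_thirds":
        return 3 * yes >= 2 * total   # at least two-thirds
    if rule == "unanimous":
        return yes == total           # all hands agree
    raise ValueError(f"unknown rule: {rule}")
```

With five yes votes out of eight, the release passes under the majority rule but fails the two-thirds rule, which is exactly the extra caution you are buying when the risk to the company is high.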

I’ve been told a few times that this could never work, yet id Software, one of the more profitable and well-known game makers, has an old slogan: “The software will ship when it’s done.”

There are other ways to make the ship/no-ship decision. You could assign a weight to every bug and say the total score has to be below some bar. Yet in my experience, arbitrary and complex metrics tend to be over-ruled by political or organizational pressures.
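To make the weighted-bug idea concrete, it might look something like the sketch below. The severity names and weights are invented for illustration, as is the bar you compare against; choosing those numbers is precisely the arbitrary part.

```python
from typing import Iterable

# Hypothetical weights; every team would argue about its own.
SEVERITY_WEIGHTS = {"blocker": 10, "major": 5, "minor": 1}


def defect_score(open_bugs: Iterable[str]) -> int:
    """Sum the weight of every open bug, given each bug's severity label."""
    return sum(SEVERITY_WEIGHTS[severity] for severity in open_bugs)


def below_bar(open_bugs: Iterable[str], bar: int) -> bool:
    """True when the total weighted score is under the agreed bar."""
    return defect_score(open_bugs) < bar
```

Notice that one major and two minor bugs score 7 and pass a bar of 10, while a single blocker plus one minor bug scores 11 and fails. Whether that matches anyone's gut feeling about the release is another matter.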

Which Method Should I Use?

Some methods are out of your control; you ship when the big boss says so. Perhaps you can inform executive management of the risks, perhaps the team can trade off some scope or defer some work, but when the CEO makes the call, the decision is pretty easy. Even if you don’t like it that way.

Yet I find that corporate cultures are in the middle of a transition, moving away from command and control and toward multi-disciplinary work teams. Companies like Amazon.com are increasingly replacing the “single wringable neck” with self-organized, self-directed work teams that can decide for themselves when to ship – which can put a lot of weight on your shoulders.

If you are in that situation, or if you are the single wringable neck at a more conservative company, you should give serious thought to your release practices. By developing a release cadence, understanding how the customer will use the software, and taking a hard look at the known defect list, you can make a best estimate of the software’s usability as well as the consequences of not releasing the software, such as lost revenue.

As the saying goes, it’s a tough job... but that’s why they pay us the big bucks.