Why Should I Want to Fail Fast?

By Destination Z posted Mon December 23, 2019 03:42 PM

The phrase ‘fail fast’ is being used more and more these days. But what does it actually mean? Is it just another buzz phrase, or is it a useful concept? Let’s have a look.

Let’s start by thinking about software development in the real world. The chances are that something will go wrong. You know that, eventually, some user will press some combination of keys and everything will stop working! What I’m suggesting is that even the best coder in the world doesn’t get everything right the first time.

The thinking behind the ‘fail fast’ phrase is that if a piece of code is doomed to go wrong at some point, it’s best if that happens while you’re still writing the code and are in a great position to fix it quickly and easily. It can be quite costly if a user has to report a fault and send details of their experience back to the developers, who may then get different programmers to look at the issue and produce a fix for what might be just part of the problem. An even more serious scenario is where other parts of the application have come to depend on the faulty code, so that fixing the mistake breaks them. Picture a graph with cost on one axis and, on the other, the length of time between a faulty piece of code being written and the error being discovered: the cost rises exponentially as that gap grows. So failing fast means that you can keep your costs down.
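To make the shape of that curve concrete, here is a minimal Python sketch. The function name fix_cost, the base cost of 100 and the 25% weekly growth rate are illustrative assumptions, not measured figures. The only point is that a compounding cost grows quickly the longer a fault goes unnoticed.

```python
def fix_cost(base_cost: float, growth_rate: float, weeks_undetected: int) -> float:
    """Hypothetical cost of fixing a fault found `weeks_undetected` weeks after it was written.

    Assumes the cost compounds by `growth_rate` each week (an illustrative
    model, not an empirical one).
    """
    if base_cost < 0 or growth_rate < 0 or weeks_undetected < 0:
        raise ValueError("all inputs must be non-negative")
    return base_cost * (1 + growth_rate) ** weeks_undetected


# Compare a fault caught straight away with ones caught later.
for weeks in (0, 4, 12, 26):
    print(f"found after {weeks:2d} weeks: cost {fix_cost(100.0, 0.25, weeks):.2f}")
```

Whatever numbers you plug in, the pattern is the same: the earliest fix is the cheapest one.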

In addition, once a customer finds a failure in a piece of software, the reputation of the vendor could suffer. Microsoft, of course, managed to make a feature of sending out regular updates to its software. For most companies, though, a problem identified by a customer could be splashed all over Facebook groups, LinkedIn groups and Listservs. As a consequence, the vendor will need at least to produce a patch, or perhaps even to bring out a new release of the software.

And that’s why failing fast is such a good idea. In the imperfect world we live in, the cost of developers fixing a problem while they are still developing is quite small compared to the enormous cost (and impact on reputation) if the failure is discovered further down the line, especially once customers are involved.

Customers discovering faults and then reporting them (or, even worse, finding their own workarounds and not telling the developer) is one of the issues often associated with waterfall (sequential design) developments. The stages in this model are conception, initiation, analysis, design, construction, testing, production/implementation and maintenance. This sequential approach is often used alongside project management methods such as PRINCE2.

Fail fast is usually associated with Agile models of working. With Agile, development and testing go hand in hand, which means that issues are discovered quickly and can be corrected as the project works towards a successful conclusion. Basically, Agile software development is a way of managing a project that involves numerous small improvements to a piece of software rather than starting with a massive long-term plan and working towards it. At the end of each iteration, customers and stakeholders can review the progress so far and re-evaluate priorities, which is why the technique is described as agile. The original plan isn’t set in stone, and new choices can be made at any time during the project to ensure that customers maximize their return on investment.
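As a small illustration of development and testing going hand in hand, the Python sketch below pairs a function with checks written at the same time. The function parse_quantity and its rules are hypothetical, invented for this example. The function rejects bad input immediately rather than passing it along, and the checks run every time the file does, so a mistake shows up while the code is still being written.

```python
def parse_quantity(text: str) -> int:
    """Turn user input into a positive order quantity (hypothetical example).

    Fails fast: bad input raises ValueError here, at the point of entry,
    instead of causing a harder-to-trace fault later on.
    """
    value = int(text)  # int() raises ValueError for non-numeric input
    if value <= 0:
        raise ValueError(f"quantity must be positive, got {value}")
    return value


def check_parse_quantity() -> None:
    """Checks written alongside the function itself."""
    assert parse_quantity("3") == 3
    for bad in ("0", "-2", "three", ""):
        try:
            parse_quantity(bad)
        except ValueError:
            continue  # rejected as expected
        raise AssertionError(f"{bad!r} should have been rejected")


check_parse_quantity()
print("all checks passed")
```

If a later change breaks one of these rules, the developer finds out on the next run, not from a customer.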

Regular meetings of the whole team can help to identify any problems, and the team as a whole can work on solving them. Meetings are held with everyone standing up, which focuses the mind and stops meetings dragging on. Agile methodologies are people-oriented rather than process-oriented. The aim is a final application that staff can readily use, rather than one they need to be trained to use.

The other term you’re likely to hear is DevOps. DevOps is more of a philosophical idea than a set of hard-and-fast rules for software development. It’s a way of working where communication and collaboration between developers and IT staff are paramount. Using a DevOps approach means organizations can deploy software more quickly than before, and the software is more likely to work as expected in a live environment. If it doesn’t work, less time is taken to fix the problem. And a high degree of automation can be built into the software from the start. IBM offers lots of software and solutions for DevOps.

So ‘fail fast’ is not a way of building failure into a project. It’s a way of recognizing that failures occur and managing them so that costs are kept down. The result is software that works because it has been tested throughout its development by the kinds of people who are likely to use it. It’s a very useful concept and not just another buzz phrase to be found only in presentations.

Trevor Eddolls is CEO at iTech-Ed Ltd, an IT consultancy. A popular speaker and blogger, he currently chairs the Virtual IMS and Virtual CICS user groups. He’s editorial director for the Arcati Mainframe Yearbook, and has been an IBM Champion every year since 2009.