In the last post, I addressed environment configuration as one area to consider before a software solution can be called production-ready. In this post, I will address another area of concern: the capability to make changes in the production environment. If this area is not addressed before the software is deployed in production, then the software is not really production-ready.
A. Production Change Capability
While the software solution is running in production, changes to some part of the production stack will often be required. A change may fix a bug, add a feature, or help implement a business or IT directive. It may upgrade a part of the environment or tune a resource at any production layer. It may be required shortly after the solution goes into production or years afterward. Most changes fall into one of two categories: software code changes or environmental changes. The capability to make such changes is important for production readiness because, once the software is deployed in production, events that necessitate changes will happen. For example, a software bug may go undetected in the pre-production phases. A part of the software environment may need an upgrade, such as moving WebSphere Application Server or the database to a newer level. Provisioning more CPU, RAM, or disk space is another example, as is tuning resources in one or more environment layers in preparation for a spike in workload. If the software is deployed in production but such changes are difficult or very time-consuming to make, the result can be customer dissatisfaction, financial loss, or damage to business reputation.
Taking a software change from the source code repository all the way to production is a complex process. It involves human expertise in various IT skills, along with tools that help streamline the process. At this point, any of the following DevOps terms may come to mind: continuous integration, continuous testing, continuous deployment, or continuous delivery. I avoided these terms in the title of this section for two reasons. First, many enterprises do not have a sophisticated end-to-end software delivery automation pipeline, yet the capability to make such changes is still critical for them. Second, the software lifecycle may not be agile in nature, and some changes do not have to be made quickly. In fact, enterprises are at various levels of DevOps maturity: an enterprise may have adopted one or more DevOps principles, but not necessarily all of them. The faster you need to deliver changes to production, the more DevOps principles you must adopt before you deploy the software in production. How many principles to adopt depends largely on the availability requirements of the software: the closer the required availability is to 100%, the more DevOps principles must be adopted. Although DevOps is not the topic of discussion here, I must say that DevOps principles may be adopted for other reasons as well. For example, software delivery efficiency and cost reduction may be the main drivers.
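To make the availability point concrete, the following sketch converts an availability target into a yearly downtime budget. It is only an illustration under stated assumptions: the function name `downtime_budget_minutes` and the sample targets are chosen for this example, and a 365-day year is assumed.

```python
# Minutes in a 365-day year (leap years ignored for simplicity).
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_budget_minutes(availability_percent: float) -> float:
    """Return the downtime allowed per year, in minutes, for an availability target."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (1 - availability_percent / 100)


# Hypothetical targets, chosen only to show how fast the budget shrinks.
for target in (99.0, 99.9, 99.99):
    budget = downtime_budget_minutes(target)
    print(f"{target}% availability allows about {budget:.1f} minutes of downtime per year")
```

Each additional nine cuts the budget by a factor of ten, which suggests why a slow, manual change process becomes harder to accommodate as the availability target rises.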
To illustrate why the capability to make a change in production quickly is critical for production readiness in certain cases, I will now describe a couple of real-world cases. A few years ago, a customer deployed their application to production. Afterward, the customer started asking: How do we deliver changes to the production environment with minimal disruption? For example, how do we upgrade the solution environment without downtime? This kind of question, and many others like it, should have been asked before the solution was deployed in production. That’s because, in the process of answering the question, you may find out that the production environment must be designed in a way that supports change delivery with minimal disruption.
More recently, I was conducting a production readiness assessment. During the assessment, I found that the environment had single points of failure, which can lead to many problems. One of these problems is that upgrading a component that is a single point of failure will often cause an outage. If users rely on the environment all the time, such an outage will be disruptive.
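A first pass at spotting this risk can be as simple as counting instances per component. The sketch below is a hypothetical illustration: the inventory, its component names, and the function `single_points_of_failure` are assumptions made for the example, not data or tooling from the assessment.

```python
def single_points_of_failure(inventory: dict[str, int]) -> list[str]:
    """Return the components that run with fewer than two instances.

    Such a component cannot be upgraded one instance at a time,
    so changing it likely means taking it offline.
    """
    return sorted(name for name, count in inventory.items() if count < 2)


# Hypothetical inventory: component name -> number of running instances.
example_inventory = {
    "load_balancer": 1,
    "application_server": 3,
    "database": 1,
}

for component in single_points_of_failure(example_inventory):
    print(f"{component}: only one instance, an upgrade will likely require an outage")
```

A real assessment would also need to consider shared dependencies such as storage, network paths, and power, which a simple instance count does not capture.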
In brief, the ability to deliver changes to production in a timely manner is a measure of production readiness.