During my 20 years of working in IP networks I’ve collected more than a few horror stories about change control chaos. In recent years, most of them have involved cloud migration work.
In theory, these kinds of changes should be really simple. We’re moving an existing working application or service from a physical location to a public cloud environment, or sometimes from an existing cloud to a new public cloud environment. There’s no software upgrade and no new patches. You simply take a working system from one location and run it in another. But the practicalities of such work are a very different proposition: scheduling the downtime for the application system, organising the multi-disciplinary team to do the work, and then doing it all during standard maintenance windows, which are inevitably at appalling times like 2am on a Sunday morning.
Let the Migration Mischief Begin
I’ve been the lead network engineer or director running the call for many cloud migration changes. And to be fair, moving the application or service is the straightforward part. However, once testing starts, the real “Migration Mischief” begins.
And it’s to be expected. We’re transitioning from an on-prem environment where “everything just used to work” to a new hybrid world. Now you need to translate ACLs on a hardware firewall into a virtual security group, and then try to figure out all of the connectivity needs of applications that aren’t well documented. You test, make a policy change, test again, roll back the change, test, make a new change, and test again. Some changes help and you get a little further, but there are new issues, so you make yet more policy changes… it goes on and on until what I like to call the “Maintenance Window Madness” kicks in.
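To make the translation problem concrete, here is a minimal sketch of why an ACL rarely maps one-to-one onto a security group. It assumes an allow-only security group model (as with AWS security groups, which have no deny rules). The `AclEntry` fields, the rule dictionary keys, and the `translate_acl` function are all hypothetical names invented for this illustration, not any vendor’s API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AclEntry:
    """Hypothetical, simplified view of one firewall ACL line."""
    action: str        # "permit" or "deny"
    protocol: str      # e.g. "tcp" or "udp"
    source_cidr: str   # e.g. "10.0.0.0/24"
    dest_port: int


def translate_acl(entries):
    """Split ACL entries into allow rules and entries that need a human.

    Assumption: the target security group is allow-only and unordered,
    so "deny" lines cannot be copied across mechanically.
    """
    rules, needs_review = [], []
    for entry in entries:
        if entry.action == "permit":
            rules.append({
                "protocol": entry.protocol,
                "cidr": entry.source_cidr,
                "from_port": entry.dest_port,
                "to_port": entry.dest_port,
            })
        else:
            needs_review.append(entry)
    return rules, needs_review


acl = [
    AclEntry("deny", "tcp", "10.0.5.0/24", 443),
    AclEntry("permit", "tcp", "10.0.0.0/16", 443),
]
rules, needs_review = translate_acl(acl)
print(len(rules), "allow rule(s),", len(needs_review), "entry(ies) for manual review")
```

In this example the deny line carves a subnet out of a broader permit, and that ordering has no direct equivalent in an allow-only group. Somebody has to rework it by hand, usually in the middle of the maintenance window.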
Welcome to Maintenance Window Madness
Generally at about 75% of the way through the change window, with the rollback point rapidly approaching, you have a decision to make. Do you push ahead, thinking you’ll find the last couple of issues in the next hour, or roll back now, knowing you’ll have to repeat everything all over again next weekend? That is, if you can get a new window for next weekend.
It’s at this point that there’s always some bright spark on the call who suggests copying the security group from the change two weeks ago because that worked. With the time constraint, you go along with the idea. That new security group gets nested into the one you’ve been working on and presto, you’ve got it working! Everybody is happy until three to six months later, when you’re doing your next security audit and you find all these security groups nested in each other, one of which has an “allow any any” in it. Now you’ve got a real headache. You have to replace that policy with an explicit one, but how many applications will fail if you touch it?
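The audit pain described above can be sketched in a few lines. This is an illustrative example only: the `groups` dictionary, its `rules` and `nested` keys, and the group names are made up for the sketch and do not correspond to any particular cloud provider’s export format.

```python
def is_allow_any(rule):
    """True for a rule that permits any protocol and port from anywhere."""
    return (
        rule["protocol"] == "any"
        and rule["cidr"] == "0.0.0.0/0"
        and rule["ports"] == "any"
    )


def find_allow_any(groups, root):
    """Walk nested groups from `root` and report each allow-any rule.

    Returns a list of (nesting path, rule) tuples. Groups already
    visited are skipped, so groups nested in each other do not loop.
    """
    findings = []
    seen = set()

    def walk(name, path):
        if name in seen:
            return
        seen.add(name)
        here = path + [name]
        for rule in groups[name]["rules"]:
            if is_allow_any(rule):
                findings.append((" -> ".join(here), rule))
        for child in groups[name].get("nested", []):
            walk(child, here)

    walk(root, [])
    return findings


# Hypothetical data: the group copied in during the window hides the problem.
groups = {
    "app-migration": {
        "rules": [{"protocol": "tcp", "cidr": "10.0.0.0/8", "ports": "443"}],
        "nested": ["copied-from-old-change"],
    },
    "copied-from-old-change": {
        "rules": [{"protocol": "any", "cidr": "0.0.0.0/0", "ports": "any"}],
        "nested": ["app-migration"],
    },
}

findings = find_allow_any(groups, "app-migration")
for path, rule in findings:
    print(path, rule)
```

Finding the rule is the easy half. The sketch says nothing about which applications actually depend on that “allow any any”, which is exactly the headache.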
The good news is the software wizards at IBM have been working on a way to make this mischief and madness a thing of the past. IBM Hybrid Cloud Mesh is an application-centric secure overlay mesh that allows for a new dynamic where the network follows the application wherever it goes. It knows all the endpoints of the application system and when any of them move to a new location, it automatically puts the required policies in place. I can almost hear a famous child wizard say “Mischief and Madness Disappearius” and poof, all the chaos disappears.
Check out this video recording of a demo of IBM Hybrid Cloud Mesh in action, and join the IBM Hybrid Cloud Mesh Community to ask our experts questions and find the most up-to-date information.