Automatically creating a ticket once an incident occurs has been done for a long time. Is it efficient? No. The tickets are triggered by manually defined thresholds, which need to be revisited time and again, and a flood of events leads to a flood of tickets....
I've just published the latest in my series of articles on SRE lessons from the Lunar Landing. In this article, I discuss some of the work done by flight controllers in Mission Control, the difficulties they faced, and how we'd approach the same problem today. Here's a hint: AIOps, specifically Cloud Pak...
When a component in the end-to-end application workflow becomes unavailable and impacts internal users or external clients, the clock starts ticking, and customer satisfaction can suffer significantly. Market research firm Aberdeen pegs the cost of an outage at about $260,000/hour. And many...
First there were distributed computing systems, then fault-tolerant systems, then autonomic computing, and now AI Operations. Someone once said that there is nothing new in Computer Science and that the same concepts keep coming back every few years. It’s like old wine being...
Does your team debate IT Operations terms and their meanings? Our IBM experts can help with this handy glossary, which also explains why these terms matter and how they can benefit your organization. Get a useful glossary of IT Operations terms here (a 5-minute read): https://www.ibm.com...