As we strive to understand generative AI and foundation models, and how they can help IT Operations sustain the business, we need ways to evaluate when they can help and when they can hurt. Our amazement at the eloquence of the models we interact with today lasts only until we start probing how a model knows what it knows, and whether it can explain its reasoning or sources. For “normal” AI, explainability and confidence are key to deciding whether a model is suited to a specific problem. Responses akin to “I'm sorry Dave, I'm afraid I can't [tell you] that!” can make you feel you are at an impasse.
When our CEO Arvind Krishna spoke to us recently, he talked about moving with urgency, prioritization and the appropriate level of process simplicity. The topic was broad in scope, but it also touched on how we explore this latest AI wave.
My takeaway was that this is a useful lens for assessing IT operations challenges: focus on how tolerant the task is to inaccuracy, and how important the reliability of the outcome is to the business goal. For example, if the as-is response to a situation is to attempt a resolution, gauge its effectiveness and adjust if it falls short, then the as-is process is already tolerant of error.
I was reminded of a challenge from one of my early products. In the document management space, when scanning and optical character recognition (OCR) were in their infancy, the accuracy percentage was the main gauge of value. Vendors spent a lot of effort chasing 100% accuracy, applying multiple techniques individually and in combination. 90% was relatively easy to achieve, but each additional percent ran into diminishing returns. Adoption remained modest and stayed away from mission-critical business processes.
In customer engagements, the turning point came when we stopped asking “how accurate is your OCR?” and started asking “can you improve on how my team is doing?” It turned out that the error rate for human transcription at most target customers was already on a par with OCR. The winning conversation, however, moved away from the technical accuracy of OCR alone and instead targeted the accuracy of the end-to-end process. Combining machine OCR with human review, guided by the uncertainties the OCR itself flagged, let customers achieve 99%+ business accuracy in many cases.
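As a minimal sketch of that hybrid pattern, the Python below routes each field by the confidence the OCR engine reports: high-confidence reads are accepted automatically, and only the uncertain ones are queued for a person. The names (OcrResult, REVIEW_THRESHOLD, route) and the threshold value are illustrative assumptions, not any specific product's API.

```python
# Hypothetical sketch of confidence-based routing: the machine handles the bulk
# of the fields, and only low-confidence reads go to human review.
from dataclasses import dataclass

@dataclass
class OcrResult:
    field: str         # e.g. "invoice_number" (illustrative field name)
    text: str          # machine-read value
    confidence: float  # 0.0 - 1.0, as reported by the OCR engine

REVIEW_THRESHOLD = 0.95  # tune per field and per the business's tolerance for error

def route(results: list[OcrResult]) -> tuple[dict, list[OcrResult]]:
    """Accept high-confidence reads automatically; flag the rest for a person."""
    accepted: dict = {}
    needs_review: list[OcrResult] = []
    for r in results:
        if r.confidence >= REVIEW_THRESHOLD:
            accepted[r.field] = r.text
        else:
            needs_review.append(r)  # a human transcribes or confirms this field
    return accepted, needs_review
```

The end-to-end accuracy then comes from the combination: the threshold decides how much work people see, and the business decides how much residual error it can live with.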
So, in your hunt for problems that generative AI and foundation models can help solve, think about the outcome, and whether an AI/human hybrid approach is the optimal answer. Perhaps a tier one service fault requires 99% certainty and human oversight before acting, while lower service tiers can tolerate an AI-generated initial response, with retries and human review on failure.
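A minimal sketch of such a tiered policy might look like the Python below. The hooks for proposing, applying and approving a fix are hypothetical stubs standing in for whatever your tooling provides; only the routing logic is the point.

```python
# Hypothetical sketch of a tier-aware remediation policy: tier one always gets
# human sign-off before acting, lower tiers try the AI suggestion with retries
# and escalate to a person on repeated failure. All hooks below are stand-ins.
MAX_RETRIES = 2

def propose_fix(incident: str) -> str:
    return f"suggested remediation for {incident}"   # stand-in for an AI call

def apply_fix(incident: str, fix: str) -> bool:
    return False                                     # stand-in: report success/failure

def request_human_approval(incident: str, fix: str) -> bool:
    return True                                      # stand-in: operator sign-off

def escalate_to_human(incident: str) -> None:
    print(f"escalating {incident} for manual handling")

def handle_incident(incident: str, tier: int) -> None:
    fix = propose_fix(incident)                      # AI-generated initial response
    if tier == 1:
        # Mission-critical: require human oversight before acting.
        if request_human_approval(incident, fix):
            apply_fix(incident, fix)
        return
    # Lower tiers: act on the suggestion, retry, then hand off to a person.
    for _ in range(MAX_RETRIES + 1):
        if apply_fix(incident, fix):
            return
        fix = propose_fix(incident)                  # ask for a revised suggestion
    escalate_to_human(incident)                      # human review on failure
```

The exact thresholds, retry counts and approval gates are business decisions; the code just makes the criticality-versus-tolerance trade-off explicit.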
Happy hunting!