Instana

The (Explosive) Value of Data to AI in Observability Products

By Vishnupriyan Govindarajan posted 9 hours ago

  

Key trends in GenAI

We’ve been seeing a few key trends in the LLM and Agentic AI space. 

  •         Large language models are becoming commoditised
  •         More than 70% of employees believe GenAI will change more than 30% of their work within 2 years
  •         Shadow AI is on the rise, as GenAI adoption is much faster at the employee level than at the organisation level. Remember the Shadow IT days?
  •         Only 1% of companies believe their GenAI investments so far have reached maturity

Do you see a common factor across these trends? Yes, it is data. Popular LLMs are trained on generic data and lack the specific context of a given business.


Client-specific data – the value tipping point to enhance GenAI

Data is the most powerful tool for GenAI. Clients need company-specific data for GenAI and agentic AI to truly create value. It is estimated that in 2022, 90% of the data generated by enterprises was unstructured, but IBM projects that only 1% is accounted for in LLMs.

As discussed in the book ‘AI Value Creators’, putting a client’s context safely to work with GenAI will be the game-changer that drives the real business outcomes organisations care about.



Problems with generic GenAI in observability products

Modern cloud-native observability products are rolling out agentic and GenAI experiences. Popular use cases include generating investigation/remediation actions, summarising incidents, suggesting the probable root cause, and offering a chatbot experience for querying data in natural language.

But the key question is: “Are we driving maximum business impact with GenAI experiences that lack a given SRE org’s context?”

Let’s discuss ‘generating investigation/remediation actions’ with an example incident, ‘Windows_Disk_High_Usage’. When this incident occurs, GenAI can be used to generate a text-based playbook or an Ansible-like runbook. Having such steps available at the point of incident can give SREs a head start toward faster investigation/remediation.

However, generic GenAI-based action generation has its downsides:

  •         Generated actions are based only on the base model’s training data
  •         They lack client-specific context and tribal knowledge spread across sources like Confluence, Ansible, ServiceNow or even Slack conversations
  •         This limits the trust, quality and utility of generated actions in real-time scenarios


Enhancing GenAI in observability products with client data

Leveraging a client’s playbook/runbook knowledge can help scale the usefulness and relevance of generated actions.

Let’s return to our example incident, ‘Windows_Disk_High_Usage’. A generic GenAI action will (rightly) ask you to identify the files accessed by the processes that have high disk usage. But a particular SRE team may maintain an internal diagnostic tool that fetches disk I/O heatmaps, and the generic GenAI knows nothing about it.

Bringing such client context to the point of incident:

  •         makes the generated actions relevant and actionable
  •         helps SREs investigate/remediate faster based on specific organisational knowledge
  •         puts a lot of buried/overlooked high-quality data to use
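
To make the idea concrete, here is a minimal Python sketch of bringing client context to the point of incident. It is only an illustration under stated assumptions, not how Instana or any other product implements this: the `RunbookSnippet` type, the `retrieve` and `build_prompt` helpers, the sample knowledge base, and the keyword-overlap ranking are all hypothetical. A real system would more likely use embeddings or a search index, and would send the resulting prompt to an LLM.

```python
import re
from dataclasses import dataclass


@dataclass
class RunbookSnippet:
    source: str  # hypothetical label for where the knowledge lives
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it on non-alphanumeric characters."""
    return {t for t in re.split(r"[^a-z0-9]+", text.lower()) if t}


def retrieve(incident: str, snippets: list[RunbookSnippet], top_k: int = 2) -> list[RunbookSnippet]:
    """Rank snippets by the number of tokens they share with the incident name."""
    query = tokenize(incident)
    scored = [(len(query & tokenize(s.text)), s) for s in snippets]
    matching = [pair for pair in scored if pair[0] > 0]
    ranked = sorted(matching, key=lambda pair: pair[0], reverse=True)
    return [snippet for _, snippet in ranked[:top_k]]


def build_prompt(incident: str, context: list[RunbookSnippet]) -> str:
    """Assemble a prompt that grounds the model in client-specific knowledge."""
    lines = [f"Incident: {incident}", "Client-specific context:"]
    lines += [f"- [{s.source}] {s.text}" for s in context]
    lines.append("Suggest investigation and remediation steps that use this context.")
    return "\n".join(lines)


# Made-up sample data standing in for a client's tribal knowledge.
KNOWLEDGE_BASE = [
    RunbookSnippet("Confluence", "For high disk usage on Windows hosts, run the internal disk I/O heatmap tool first."),
    RunbookSnippet("ServiceNow", "Database failover checklist for the payments cluster."),
    RunbookSnippet("Slack", "Disk usage alerts on build agents are usually caused by stale artifact caches."),
]

incident = "Windows_Disk_High_Usage"
context = retrieve(incident, KNOWLEDGE_BASE)
prompt = build_prompt(incident, context)
print(prompt)
```

With this sample data, the snippet about the internal heatmap tool ranks first and the unrelated failover checklist is left out, so the prompt carries exactly the kind of team-specific knowledge a base model cannot know on its own.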

 

 


In conclusion, it is becoming increasingly evident that client-specific data can scale the real-time adoption of GenAI and create real business value by improving MTTR and productivity.

As Instana is set to drop its 300th release, we’re super excited to continue partnering with our clients to co-create GenAI and agentic experiences by adopting trustworthy AI practices.


#LLM
#SRE
