Instana Wants To Help You Resolve Issues – Faster!

By Jeff Hamilton posted Wed August 31, 2022 12:00 AM

  

The content on this page was originally published on Instana.com and has been migrated to the community as a historical asset. As such, it may contain outdated information on our products and features. Please comment if you have questions about the content. 

When your application users are having a bad day, it stands to reason that you are also having a bad day, especially if you don’t know what is causing their pain or how to go about resolving the issue. Instana is very interested in making your life easier in this regard. Follow us on a role-playing journey through three aspects of turning your users’ bad day around.

First, let’s play the role of a news reporter – What’s wrong in the world today?

Luckily, you are already running Instana, so you already have some things going for you when it comes to knowing what’s wrong for your users. You have auto-discovery across the infrastructure, middleware, and programming languages in your cloud and on-premises environments, virtually eliminating blind spots where problems could hide.

You also capture 100% of your end-user transactions, so for any specific customer issue you can see what that customer experienced, and you are alerted when users are not having a good experience.

From a data capture perspective, you are capturing metrics at one-second granularity along with related logs, so when Instana alerts you to a problem, you already have all of the relevant metrics, traces, and logs correlated in real time. Instana also alerts within three seconds of a problem, so we are giving you as much time as possible to save the day for your users. Whew, you are ahead of the game already!

Let’s play detective – Who did it?

Assigning a detective is the first step in any investigation. Who is looking into this problem? Do you have an application SRE team or a central IT Ops team looking at this? Or are your application developers on call for issues like this? Assigning a single detective may sound trivial, but in many cases coordinating the investigation of an issue and being able to hand it off cleanly to others is a big part of resolving issues quickly.

Gathering all the facts is the next step. What was happening at the time? Does this look like a past incident? This usually involves gathering traces and logs and running diagnostics or tests to find out as much as possible about how we got here.

Now it’s time to eliminate suspects and find the culprit. For some issues this may be trivial, but for others it may involve sifting through traces and logs and pulling in witnesses as well as subject matter experts to weigh in. Once we get past any finger pointing and lay out and follow the evidence, we can solve the case and nab the perpetrator (isolate the problem)!

Let’s play mechanic – How do we fix it?

We’re happy that we have isolated the problem (high fives all around). Now, how do we fix it? For recurring problems there may be a known or obvious fix. However, this may be a brand new problem, or it may be a problem that was seen a long time ago or fixed by someone else, so we don’t know what steps were taken to fix it before. Knowing which script, tool, or runbook was used, who used it, and whether it worked can save multiple trips to the repair shop (attempts) and thus drastically reduce our time to resolve the issue. AI can also be a huge help here, not only in identifying similar problems and their solutions, but also in recommending fixes for new problems if the symptoms or root cause are similar.
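To make the idea of a repair history concrete, here is a minimal sketch of what tracking past fixes might look like. It is not an Instana feature or API; the `RemediationAttempt` record, the `rank_runbooks` helper, the problem labels, and the script names are all hypothetical and exist only for illustration. The sketch records which runbook was tried, who ran it, and whether it worked, then ranks runbooks for a given problem by how often they succeeded.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RemediationAttempt:
    # All fields are hypothetical, chosen only to mirror the text above.
    problem_signature: str  # label for the isolated problem
    runbook: str            # script, tool, or runbook that was tried
    operator: str           # who ran it
    worked: bool            # whether it resolved the issue


def rank_runbooks(history, problem_signature):
    """Return (runbook, successes, attempts) for one problem, best success rate first."""
    tally = defaultdict(lambda: [0, 0])  # runbook -> [successes, attempts]
    for attempt in history:
        if attempt.problem_signature == problem_signature:
            tally[attempt.runbook][1] += 1
            if attempt.worked:
                tally[attempt.runbook][0] += 1
    ranked = sorted(
        tally.items(),
        # Sort by success rate, then by number of attempts as a tie-breaker.
        key=lambda item: (item[1][0] / item[1][1], item[1][1]),
        reverse=True,
    )
    return [(name, wins, tries) for name, (wins, tries) in ranked]


# Made-up history entries for illustration only.
history = [
    RemediationAttempt("db-connection-pool-exhausted", "restart-app-pods.sh", "alice", False),
    RemediationAttempt("db-connection-pool-exhausted", "raise-pool-limit.sh", "bob", True),
    RemediationAttempt("db-connection-pool-exhausted", "raise-pool-limit.sh", "alice", True),
    RemediationAttempt("disk-full", "rotate-logs.sh", "carol", True),
]

best = rank_runbooks(history, "db-connection-pool-exhausted")
for runbook, wins, tries in best:
    print(f"{runbook}: worked {wins} of {tries} times")
```

Even a simple record like this answers the mechanic's questions (what was tried, by whom, and did it work) before anyone makes another trip to the repair shop.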

Instana remediation and resolving your issues faster – How can we help?

We at Instana are currently looking at how we can help streamline the steps involved in both isolating problems and taking the actions required to fix them. We have delivered on automating observability and getting our problem notification down to three seconds, so we are in a unique position to help orchestrate remediation and use AI to assist with it, reducing that time for you next.

For users who also rely on other APM and infrastructure monitoring tools, Watson AIOps is the tool of choice for resolving these operational challenges with the assistance of AI. We are looking to bring this benefit to Instana directly for users who are fully embracing Instana for an application and its infrastructure.

We would love to understand how you are solving these challenges today, and to find solutions that improve on this to resolve your issues faster. If you would be interested in helping us with our research in this area and potentially trying out some prototype solutions, please contact jeffh@ca.ibm.com.
