AIOps

  • 1.  Improving the Workflow

    Posted Tue April 17, 2018 04:54 PM

    How can a typical "problem determination" or "troubleshooting" workflow be improved? In today's hybrid cloud environments, where systems of engagement interact with traditional systems of record, a lot of time can be spent trying to isolate the root cause of performance and availability problems. This results in a longer MTTR and, in turn, unhappy customers. What are your tips for improving that workflow?



  • 2.  RE: Improving the Workflow

    Posted Tue April 17, 2018 08:40 PM

    We now have the technology to integrate a lot of our software product logs (StdOut, StdErr, etc.). The combination of log streams and Watson analytics could be very powerful, but we're just getting started with these things, so we don't yet have a lot of real-world experience.

    I'd love to hear from anyone who is doing this in production now.

    Regards,

    Glen Brumbaugh



  • 3.  RE: Improving the Workflow

    Posted Thu April 19, 2018 12:46 AM

    This is by no means a new problem; however, the cloud/on-prem hybrid does add a further dimension.

    These are the areas I would focus on:

    1. Log aggregation. Having a single source of log and related machine data is a great place to start. Even with simple query capabilities you can establish error frequencies, locations and lead-up.
    2. Event management and correlation. Having the means to reduce the noise to a sensible amount will save a lot of time. Being told a database is down 1000 times is quite annoying, and being told that all of its dependencies can't satisfy their requests doesn't really help.
    3. Analytical query capability. The bigger the better, in my opinion. Doing basic queries is good, but being able to apply more complex queries and transformations will allow greater insights, which leads nicely onto the next point.
    4. Machine learning. A number of tools have embraced the various algorithms now readily available. I've seen a number of demos where ML has been able to reduce the chatter and help identify symptoms and patterns that can be counterintuitive.
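
    To make points 1 and 2 a bit more concrete, here is a minimal Python sketch of the kind of simple query I have in mind: counting errors per host, and collapsing repeated events into one entry with a count. The log format, the host names and the sample lines are all made up for illustration; a real deployment would do this inside whatever log aggregation or event management tool is in use.

```python
import re
from collections import Counter

# Assumed (hypothetical) log format: "<timestamp> <host> <LEVEL> <message>"
LINE_RE = re.compile(r"^(?P<ts>\S+) (?P<host>\S+) (?P<level>[A-Z]+) (?P<msg>.*)$")


def parse(lines):
    """Yield one dict per line that matches the assumed format; skip the rest."""
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match:
            yield match.groupdict()


def error_frequencies(records):
    """Point 1: count ERROR records per host (frequency and location)."""
    return Counter(r["host"] for r in records if r["level"] == "ERROR")


def collapse_duplicates(records):
    """Point 2: collapse identical (host, message) errors into one entry with a count."""
    counts = Counter(
        (r["host"], r["msg"]) for r in records if r["level"] == "ERROR"
    )
    return [
        {"host": host, "msg": msg, "count": n}
        for (host, msg), n in counts.most_common()
    ]


# Made-up sample data, standing in for an aggregated log stream.
sample = [
    "2018-04-19T00:40:01Z db01 ERROR database is down",
    "2018-04-19T00:40:02Z db01 ERROR database is down",
    "2018-04-19T00:40:02Z app01 ERROR cannot reach db01",
    "2018-04-19T00:40:03Z db01 ERROR database is down",
    "2018-04-19T00:40:04Z app01 INFO retrying",
]

records = list(parse(sample))
print(error_frequencies(records))
print(collapse_duplicates(records))
```

    Exact-match grouping like this is only noise reduction. Working out that the app01 errors are a consequence of the db01 outage needs dependency or topology information, which is where proper event correlation comes in.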

    I've not heard of anyone using Watson to help here, but I guess it is possible.