AIOps


Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

  • 1.  Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Fri April 02, 2021 02:38 PM
    I have Watson AIOps 2.0 Event Manager installed on OpenShift 4.7, and the healthcron pod cannot be created. 'oc get pods' shows it in the CreateContainerError state, and 'oc describe pod <pod_name>' shows this error:

    Events:
      Type     Reason  Age                   From     Message
      ----     ------  ----                  ----     -------
      Warning  Failed  64m (x533 over 3h6m)  kubelet  (combined from similar events): Error: container create failed: time="2021-04-02T17:06:12Z" level=error msg="container_linux.go:366: starting container process caused: chdir to cwd (\"/home/netcool\") set in config.json failed: permission denied"

    Any recommended fix?
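
    When the same event repeats hundreds of times, it can help to pull the failing directory out of the message automatically. The following is a minimal sketch that assumes the message has the 'chdir to cwd (...)' shape shown above; the function name and the regular expression are illustrative and not part of any IBM or OpenShift tooling.

```python
import re
from typing import Optional

# Matches the runtime's 'chdir to cwd ("<path>")' fragment, with or without
# the backslash-escaped quotes that 'oc describe' prints.
CHDIR_RE = re.compile(r'chdir to cwd \(\\?"(?P<path>[^"\\]+)\\?"\)')


def failing_workdir(message: str) -> Optional[str]:
    """Return the directory the container could not chdir into, or None."""
    match = CHDIR_RE.search(message)
    if match and "permission denied" in message:
        return match.group("path")
    return None


# The message text from the event above, including the escaped quotes.
message = (
    "container_linux.go:366: starting container process caused: "
    'chdir to cwd (\\"/home/netcool\\") set in config.json failed: permission denied'
)
print(failing_workdir(message))  # prints /home/netcool
```

    This only identifies which working directory the runtime rejected; it does not say why the permission check failed.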


    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------


  • 2.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 08:11 AM

    Hi Frank

    I can't say for certain, but what you are seeing looks strikingly similar to https://bugzilla.redhat.com/show_bug.cgi?id=1934177, which is reported as a regression in OCP.

    I know for sure that Event Manager works fine on 4.4 and 4.5, and we have a recent install on 4.6.17_1533. Do you have the option of trying one of those versions to see whether the issues you reported go away?
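
    As a rough sketch, a version string can be compared against the releases mentioned above before installing. The list below contains only the versions named in this post; the function names are made up for illustration, and treating the "_1533" suffix as a build tag to be ignored is an assumption.

```python
# Versions reported as working in this post; not an official support matrix.
KNOWN_GOOD = ("4.4", "4.5", "4.6.17")


def base_version(version: str) -> str:
    """Drop a trailing build suffix such as '_1533' (assumed to be a build tag)."""
    return version.split("_", 1)[0]


def is_known_good(version: str) -> bool:
    """True if the version equals, or is a patch release of, a listed version."""
    base = base_version(version)
    return any(base == good or base.startswith(good + ".") for good in KNOWN_GOOD)


print(is_known_good("4.6.17_1533"))  # prints True
print(is_known_good("4.7.0"))        # prints False
```

    The version string itself can be read from the output of 'oc get clusterversion'.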



    ------------------------------
    john postoyko
    IBM
    London
    ------------------------------



  • 3.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 08:49 AM
    Thanks for that information, John. I can try one of those versions now that I know one works. I actually tried OpenShift 4.6.23 (latest 4.6), and it was too unstable to even install Event Manager. So I'll destroy this cluster and create a new one at 4.6.17_1533.

    Frank

    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------



  • 4.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 04:27 PM
    Alright. I installed OpenShift 4.6.17 (I don't know where you would find or see the "_1533" part of the version number - I downloaded the only 4.6.17 available). And now topology search works. Great.

    Unfortunately, a couple of other pods have issues:

    topology-status is in CrashLoopBackOff

    and 

    topology-nasm-net-disco-collector is in CreateContainerConfigError
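
    To spot pods like these without scanning the list by eye, the text output of 'oc get pods' can be filtered. This is a minimal sketch that assumes the default column layout (NAME, READY, STATUS, RESTARTS, AGE); the pod names, restart counts, and ages in the sample are invented for illustration.

```python
from typing import Dict

# Statuses treated as healthy in this sketch.
HEALTHY = {"Running", "Completed"}


def unhealthy_pods(oc_get_pods_output: str) -> Dict[str, str]:
    """Map pod name to status for every pod whose STATUS is not healthy."""
    result: Dict[str, str] = {}
    lines = oc_get_pods_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) >= 3 and fields[2] not in HEALTHY:
            result[fields[0]] = fields[2]
    return result


# Invented sample in the default 'oc get pods' layout.
sample = """
NAME                                              READY   STATUS                       RESTARTS   AGE
example-topology-status-abc12                     0/1     CrashLoopBackOff             12         40m
example-topology-nasm-net-disco-collector-def34   0/1     CreateContainerConfigError   0          40m
example-topology-search-ghi56                     1/1     Running                      0          40m
"""

for name, status in unhealthy_pods(sample).items():
    print(name, status)
```

    The sketch looks only at the STATUS column, so a pod that is Running but not yet ready (for example 0/1) is not flagged.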


    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------



  • 5.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 04:30 PM
    Oh, and the healthcron pod is in the same state as the original problem, with the same error message about permissions on /home/netcool.

    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------



  • 6.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 04:57 PM

    This is where I see the _1533 part



    ------------------------------
    john postoyko
    IBM
    London
    ------------------------------



  • 7.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 05:23 PM
    I'm running OpenShift on-prem in vSphere, so my console is different than yours, and doesn't include that information. All I see anywhere is 4.6.17.

    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------



  • 8.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 04:59 PM

    I know it doesn't help directly, but my healthcron seems to be running every hour.



    ------------------------------
    john postoyko
    IBM
    London
    ------------------------------



  • 9.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 05:03 PM


    topology-status is in CrashLoopBackOff

    Is there any additional information visible in the Events or Logs tab while it is in this state?



    ------------------------------
    john postoyko
    IBM
    London
    ------------------------------



  • 10.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 05:42 PM
    These are the only messages in Events (and no errors are seen in the Logs tab):



    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------



  • 11.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 05:06 PM

    topology-nasm-net-disco-collector is in CreateContainerConfigError

    I had something similar a few weeks back: I had set disco to true but forgot to assign a storage class.

    I don't know if you are planning to use that feature. If not, set it to false in the YAML file.
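
    The mistake described above can be caught with a simple pre-install check. The sketch below is only an illustration: the keys 'net_disco_enabled' and 'storage_class' are hypothetical placeholders, not the real field names in the install YAML, so they would need to be mapped to whatever the actual file uses.

```python
from typing import Any, Dict, List


def disco_config_problems(settings: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in a (hypothetical) settings dict."""
    problems: List[str] = []
    # Hypothetical keys: discovery switched on without any storage class.
    if settings.get("net_disco_enabled") and not settings.get("storage_class"):
        problems.append("network discovery is enabled but no storage class is set")
    return problems


print(disco_config_problems({"net_disco_enabled": True, "storage_class": ""}))
print(disco_config_problems({"net_disco_enabled": False}))
```

    The first call reports one problem and the second reports none, which matches the two ways out mentioned above: assign a storage class or turn the feature off.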



    ------------------------------
    john postoyko
    IBM
    London
    ------------------------------



  • 12.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Mon April 05, 2021 05:24 PM
    That makes sense. I did not assign a storage class.

    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------



  • 13.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Tue April 06, 2021 08:11 PM
    Just checking back to see where you are with this.

    ------------------------------
    john postoyko
    IBM
    London
    ------------------------------



  • 14.  RE: Watson AIOps 2.0 - NOI healthcron pod CreateContainerError

    Posted Wed April 07, 2021 10:12 AM
    OpenShift 4.6.17 seems to work the best out of the versions I've tried. Thanks for the help, John.

    ------------------------------
    Frank Tate
    Gulfsoft Consulting
    ------------------------------