Cloud Pak for Data Group


Unable to administer CPD 3.5

  • 1.  Unable to administer CPD 3.5

    Posted 30 days ago
    Hi all,

    On a new install of CPD 3.5, I'm getting this error when I choose 'Manage the platform'. I'm logged in as admin and have confirmed that the admin user has the 'Administer platform' permission enabled. I also tried creating a new user with the Administrator role, but I still get the same error.

    Any ideas?



    ------------------------------
    Phil Fox
    ------------------------------


  • 2.  RE: Unable to administer CPD 3.5

    Posted 29 days ago

    Hi Phil.

    Could you please try to find a pod with a name like:

    zen-watchdog

    and provide us the logs?

    oc logs zen-watchdog-XXXX > log.txt
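
    The generated suffix (XXXX) differs on every cluster, so the pod name has to be looked up first. A minimal sketch of that lookup, assuming the pod name starts with zen-watchdog- and that the CPD project is the current one; the helper name watchdog_pods is made up for illustration:

```shell
# Hypothetical helper: reads `oc get pods --no-headers` output on stdin
# and prints the names of pods whose name starts with "zen-watchdog-".
watchdog_pods() {
  awk '$1 ~ /^zen-watchdog-/ { print $1 }'
}

# Usage on a live cluster (not run here):
#   oc get pods --no-headers | watchdog_pods
#   oc logs <name printed above> > log.txt
```

    The prefix match may also return pods created by similarly named cron jobs, so check the list before picking one.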

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------



  • 3.  RE: Unable to administer CPD 3.5

    Posted 29 days ago

    Hi Tomasz, thanks for taking a look. Seems like many of the pods are actually in an error state.



    ------------------------------
    Phil Fox
    ------------------------------

    Attachment(s)

    txt
    zen-watchdog-log.txt   6.32 MB 1 version
    txt
    oc-get-pods.txt   27 KB 1 version


  • 4.  RE: Unable to administer CPD 3.5

    Posted 28 days ago

    Hi,

    Yes, in fact a lot of pods/collectors are not working.

    What's causing the 500 is:

    zen-watchdog-778fb6bbb7-5gqxz                                 0/1     CreateContainerError   1          21d

    Can you please describe that pod?

    oc describe po zen-watchdog-778fb6bbb7-5gqxz 
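
    Since many pods are failing, it can help to list all of them in one pass rather than spotting them by eye. A rough sketch, assuming the default oc get pods column layout (NAME, READY, STATUS, RESTARTS, AGE); unhealthy_pods is a hypothetical helper name:

```shell
# Hypothetical helper: reads `oc get pods --no-headers` output on stdin and
# prints "name status" for every pod that is neither fully ready nor a
# finished job pod.
unhealthy_pods() {
  awk '
    {
      split($2, ready, "/")              # READY column, e.g. 0/1
      if ($3 == "Completed") next        # finished job pods are expected
      if ($3 != "Running" || ready[1] != ready[2]) print $1, $3
    }'
}

# Usage on a live cluster (not run here):
#   oc get pods --no-headers | unhealthy_pods
```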

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------



  • 5.  RE: Unable to administer CPD 3.5

    Posted 11 days ago
    Hi Tomasz, Happy New Year

    As requested:
    oc describe po zen-watchdog-778fb6bbb7-5gqxz


    ------------------------------
    Phil Fox
    ------------------------------

    Attachment(s)

    txt
    oc-describe-zen.txt   7 KB 1 version


  • 6.  RE: Unable to administer CPD 3.5

    Posted 10 days ago
    Hi,

    Please note that the pod has restarted many times over the last few weeks:
    (x142209 over 21d)

    Some events/details may have been removed.
    Can you try deleting that pod (oc delete pod ....)? It will come back with a slightly different name.

    Please collect the describe output of the new pod, plus oc get events --sort-by='{.lastTimestamp}' (or oc get events --sort-by=.metadata.creationTimestamp).
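
    On a busy namespace the event list gets long. A minimal sketch for narrowing it down to a single pod, assuming the column layout of recent oc clients (LAST SEEN, TYPE, REASON, OBJECT, MESSAGE); events_for_pod is a hypothetical helper name and older clients may print different columns:

```shell
# Hypothetical helper: reads `oc get events` output on stdin and keeps only
# the rows whose OBJECT column (4th field) is pod/<name>.
events_for_pod() {
  awk -v obj="pod/$1" '$4 == obj'
}

# Usage on a live cluster (not run here):
#   oc get events --sort-by=.lastTimestamp --no-headers | events_for_pod <new-pod-name>
```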

    Thanks

    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------



  • 7.  RE: Unable to administer CPD 3.5

    Posted 10 days ago
    Hi,  

    oc describe pod zen-watchdog-778fb6bbb7-shqjm > oc_describe.txt
    oc get events --sort-by='{.lastTimestamp}' > oc_events.txt


    Thanks,

    ------------------------------
    Phil Fox
    ------------------------------

    Attachment(s)

    txt
    oc_events.txt   535 KB 1 version
    txt
    oc_describe.txt   8 KB 1 version


  • 8.  RE: Unable to administer CPD 3.5

    Posted 8 days ago

    Hi,

    I know it's not much of an answer, but I see the pod and jobs working fine now.
    Can you check the UI?

    If you still face these problems, please open a ticket via IBM Support.

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------



  • 9.  RE: Unable to administer CPD 3.5

    Posted 7 days ago
    Thanks Tomasz, I appreciate the time you've taken to look into this. I am now able to pull up the Platform Management page. Could this be caused simply by a lack of resources? The cluster is a bare-minimum configuration with 3 masters and 3 workers, running the wsl and wml assemblies. I was hoping the Platform Management page would give me some pointers as to how 'full' the environment is, but nothing jumps out. Most of the 633 issues seem to be related to failed cron jobs (diagnostics, watchdog-alert-monitoring, zen-watchdog).



    On the OCP dashboard, the only metric that looks worrying is the CPU Limits Commitment at 185.14%.
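
    For context, that figure is, as far as I understand the dashboard, the sum of all container CPU limits divided by the allocatable CPU of the nodes, so a value above 100% means the limits are overcommitted, not that the CPUs are actually busy. A small sketch of the arithmetic with made-up numbers (the helper name and both inputs are hypothetical, not taken from this cluster):

```shell
# Hypothetical helper: commitment_pct <sum of CPU limits in cores> <allocatable cores>
# Prints limits / allocatable as a percentage with two decimals.
commitment_pct() {
  awk -v limits="$1" -v alloc="$2" 'BEGIN { printf "%.2f\n", limits / alloc * 100 }'
}

# Made-up example: 36 cores of limits on 24 allocatable cores
commitment_pct 36 24   # prints 150.00
```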



    ------------------------------
    Phil Fox
    ------------------------------



  • 10.  RE: Unable to administer CPD 3.5

    Posted yesterday
    Hi,

    The overcommitment is fine (to a certain degree), and that's not the cause of the issue.

    I doubt the resources were causing this. I'd be more inclined towards storage issues (an intermittent drop in connectivity, for example).

    If you face this again, please raise a support ticket, and we should be able to pinpoint the root cause.

    Thanks

    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------



  • 11.  RE: Unable to administer CPD 3.5

    Posted 8 hours ago
    Hi Tomasz,

    I am facing a similar issue.
    I installed CP4D 3.5 a few weeks ago and everything was OK.
    A few days ago I noticed that I got an error when accessing Manage the platform. The zen-watchdog pod was running, and I was told to delete it.
    So I deleted it, but since then the pod cannot start successfully. I could not find any interesting messages in the logs or in the events.
    Would you have any idea?
    Many thanks in advance for your support!

    ------------------------------
    Valerie Le Roy
    ------------------------------