Containers, Kubernetes, OpenShift on Power

  • 1.  Error during OpenShift Installation in Power 9 Server

    Posted Thu January 23, 2025 09:33 AM

    Hi

    I'm reaching out for assistance with an issue I encountered during the installation of a 3-node OpenShift 4.17.11 cluster on an IBM Power 9 Server.

    The installation process seems to proceed normally up to the Finalizing Stage, but then it takes an unusually long time and eventually fails, resulting in an error. I attempted a fresh reinstallation, but the issue persists, ending with the same outcome.
    Below is a screenshot of the cluster events:



    ------------------------------
    Anup Regmi
    ------------------------------


  • 2.  RE: Error during OpenShift Installation in Power 9 Server

    Posted Thu January 23, 2025 09:42 AM

    Hey Anup, I think the problem is with your load balancer; I don't think everything is set up yet. The authentication and console operators depend on ingress being fully set up. If you can share the output of `oc get co ingress`, it should help tell us what is going on. Thanks, Paul
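
    A minimal sketch of how the ingress operator status could be summarized from `oc get co ingress -o json`. It assumes the usual ClusterOperator layout (a `status.conditions` list whose entries carry `type`, `status`, and `message`). The helper names `unhealthy_conditions` and `fetch_cluster_operator` are made up for this illustration and are not part of any tool.

```python
import json
import subprocess

# Assumed healthy values for the three standard ClusterOperator conditions.
HEALTHY = {"Available": "True", "Progressing": "False", "Degraded": "False"}


def unhealthy_conditions(cluster_operator):
    """Return one line per standard condition that is not in its healthy state."""
    problems = []
    conditions = cluster_operator.get("status", {}).get("conditions", [])
    for cond in conditions:
        expected = HEALTHY.get(cond.get("type"))
        if expected is None:
            continue  # not one of the three conditions checked here
        if cond.get("status") != expected:
            message = cond.get("message", "")
            problems.append(f"{cond['type']}={cond.get('status')}: {message}")
    return problems


def fetch_cluster_operator(name):
    """Run `oc get co <name> -o json` and parse the result (requires a logged-in oc)."""
    result = subprocess.run(
        ["oc", "get", "co", name, "-o", "json"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)
```

    On a machine with a working `oc` login, `print(unhealthy_conditions(fetch_cluster_operator("ingress")))` would list whatever is keeping the operator from being healthy.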



    ------------------------------
    PAUL BASTIDE
    Senior Software Engineer
    IBM
    ------------------------------



  • 3.  RE: Error during OpenShift Installation in Power 9 Server

    Posted Fri January 24, 2025 01:57 AM
    Edited by Anup Regmi Fri January 24, 2025 02:13 AM

    Hi Paul, Thanks for your response.

    It looks like that is the issue. I was installing with the Assisted Installer and thought Red Hat would take care of those things.
    Below are the status and the cause of the failure:

    The "default" ingress controller reports Degraded=True: DegradedConditions: One or more other status conditions indicate a degraded state: CanaryChecksSucceeding=False (CanaryChecksRepetitiveFailures: Canary route checks for the default ingress controller are failing. Last 2 error messages:
          error sending canary HTTP Request: Timeout: Get "https://canary-openshift-ingress-canary.apps.ppcocp.thakralone.lab": context deadline exceeded (Client.Timeout exceeded while awaiting headers)
          error sending canary HTTP request: DNS error: Get "https://canary-openshift-ingress-canary.apps.ppcocp.thakralone.lab": dial tcp: lookup canary-openshift-ingress-canary.apps.ppcocp.thakralone.lab on 172.30.0.10:53: no such host (x5932 over 98h52m4s))


    When I checked the status of the route mentioned above, I got the responses below:

    #oc get route canary -n openshift-ingress-canary
    NAME     HOST/PORT                                                    PATH   SERVICES         PORT   TERMINATION            WILDCARD
    canary   canary-openshift-ingress-canary.apps.ppcocp.thakralone.lab          ingress-canary   8443   passthrough/Redirect   None
     
    #oc get svc -n openshift-ingress-canary
    NAME             TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
    ingress-canary   ClusterIP   172.30.183.207   <none>        8443/TCP,8888/TCP   6d19h
     
    #oc get endpoints ingress-canary -n openshift-ingress-canary
    NAME             ENDPOINTS                                                       AGE
    ingress-canary   10.128.0.86:8443,10.129.0.58:8443,10.130.0.6:8443 + 3 more...   6d19h
    Also, looking at openshift-dns, all the required pods seem to be running.

    Can you guide me on how to fix this error?



    ------------------------------
    Anup Regmi

    ------------------------------



  • 4.  RE: Error during OpenShift Installation in Power 9 Server

    Posted Fri January 24, 2025 08:14 AM
    Hi Anup,

    The OpenShift on Power team built automation for this setup: https://github.com/ocp-power-automation/ocp4-ai-power


    Let us know if you have questions after consulting the setup.

    Thank you

    Paul


    ------------------------------
    PAUL BASTIDE
    Senior Software Engineer
    IBM
    ------------------------------



  • 5.  RE: Error during OpenShift Installation in Power 9 Server

    Posted Fri January 24, 2025 10:01 AM

    Hi Anup,

    The Assisted Installer for Power does not support cluster-managed networking; only user-managed networking works, so a bastion must be set up for the OCP installation. Did you do that before installing with the Assisted Installer? Also, which Power environment do you use, HMC only or PowerVC? You can refer to the repo Paul linked above for the bastion setup.
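
    With user-managed networking, the load balancer in front of the cluster is something you provide yourself. A minimal sketch for checking that it answers on the expected ports, assuming the standard user-provisioned layout (6443 for the API, 22623 for the machine config server, 80 and 443 for ingress). The names `LB_PORTS`, `port_open`, and `check_load_balancer` are hypothetical, and the host you pass in is whatever address your bastion or load balancer uses.

```python
import socket

# Ports a user-provided load balancer normally fronts for OpenShift.
# Assumption: the standard user-provisioned infrastructure layout.
LB_PORTS = {
    6443: "Kubernetes API",
    22623: "machine config server",
    443: "ingress HTTPS",
    80: "ingress HTTP",
}


def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_load_balancer(host):
    """Map each expected port to whether it accepted a TCP connection."""
    return {port: port_open(host, port) for port in LB_PORTS}
```

    A successful TCP connect only shows that something is listening; it does not prove that the backend pools point at the right nodes, so the haproxy configuration still needs a look.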

    Thanks.

    C. Zhang



    ------------------------------
    CHONGSHI ZHANG
    ------------------------------



  • 6.  RE: Error during OpenShift Installation in Power 9 Server

    Posted 28 days ago

    Hi

    I have an HMC-only environment. I will try installing as per Paul's suggestion. I hope SNO is supported as well.

    Thanks for your response.



    ------------------------------
    Anup Regmi
    ------------------------------



  • 7.  RE: Error during OpenShift Installation in Power 9 Server

    Posted 28 days ago

    Hey Anup, 

    Yes, HMC is supported. If you follow https://github.com/ocp-power-automation/ocp4-upi-powervm-hmc that should guide you with Power9 and HMC.

    Yes, SNO is supported. 

    Chongshi, who responded earlier, is our lead developer. Feel free to ask questions here.

    Thank you,

    Paul



    ------------------------------
    PAUL BASTIDE
    Senior Software Engineer
    IBM
    ------------------------------



  • 8.  RE: Error during OpenShift Installation in Power 9 Server

    Posted 23 days ago

    Hi Anup,

    To install SNO on Power with HMC, you need the playbook designed for SNO-only installation: https://github.com/cs-zhang/ocp4-upi-sno

    For a normal HMC install, use this playbook: https://github.com/ocp-power-automation/ocp4-upi-powervm-hmc

    If you have any questions, just let me know.

    Thanks

    C. Zhang



    ------------------------------
    CHONGSHI ZHANG
    ------------------------------



  • 9.  RE: Error during OpenShift Installation in Power 9 Server

    Posted Fri January 24, 2025 01:18 PM

    Adding to Paul's suggestion, I would recommend checking out this repo as well: https://github.com/redhat-cop/ocp4-helpernode

    I installed our physical-virtual hybrid cluster loosely following this as an example. I went down the "user provisioned infrastructure" path, which isn't as automated a setup, but this project does detail how haproxy is configured and touches on ingress as well.

    Another note: DNS must be working flawlessly in an OCP cluster, or things won't work well, if at all.
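
    A minimal sketch of a DNS sanity check, assuming the usual OpenShift records: `api.<cluster>.<domain>`, `api-int.<cluster>.<domain>`, and a wildcard for `*.apps.<cluster>.<domain>`. The canary hostname is the one from the error earlier in this thread; the function names `required_dns_names` and `resolve` are hypothetical.

```python
import socket


def required_dns_names(cluster_name, base_domain):
    """Build the hostnames that need to resolve for the given cluster."""
    suffix = f"{cluster_name}.{base_domain}"
    return [
        f"api.{suffix}",
        f"api-int.{suffix}",
        # Any name under *.apps should resolve through the wildcard record;
        # the canary route is used here because it is the one that failed.
        f"canary-openshift-ingress-canary.apps.{suffix}",
    ]


def resolve(name):
    """Return the sorted list of addresses for name, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(name, None)
    except socket.gaierror:
        return []
    return sorted({info[4][0] for info in infos})


def report(cluster_name, base_domain):
    """Print each required name with its addresses or a failure marker."""
    for name in required_dns_names(cluster_name, base_domain):
        addresses = resolve(name)
        print(name, "->", ", ".join(addresses) if addresses else "DOES NOT RESOLVE")
```

    For the cluster in this thread, that would be `report("ppcocp", "thakralone.lab")`. The failing lookup in the error was made against 172.30.0.10:53, so it is worth running the check from the bastion and from a cluster node, not only from a workstation.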

    Best,

    Max Schmidt
    FRA - Computational Scientist

    Center for Quantitative Life Sciences
    Oregon State University



    ------------------------------
    Maximilllian Schmidt
    ------------------------------