Containers, Kubernetes, OpenShift on Power


Exec format error with Calico CNI

  • 1.  Exec format error with Calico CNI

    Posted Fri February 10, 2023 11:53 AM

    Hello,

    I'm setting up a Kubernetes cluster with an IC922 head node and some AC922 compute nodes. I'm not using OpenShift but rather open-source Kubernetes, though I'm told that it's okay to post my question here anyway.

    After creating the cluster with kubeadm and applying Calico as the CNI, I noticed a pod that keeps failing on each node:

    calico-system      csi-node-driver-4fm4j                                  1/2     CrashLoopBackOff   13911 (115s ago)    49d
    calico-system      csi-node-driver-vv55z                                  1/2     CrashLoopBackOff   32008 (2m25s ago)   113d

    Looking more into it, it's specifically the node-driver-registrar container within the pod:

    # kubectl logs csi-node-driver-4fm4j -n calico-system -c csi-node-driver-registrar
    exec /usr/local/bin/node-driver-registrar: exec format error

    But I can't actually get into the container to look at the file:

    # kubectl exec -it csi-node-driver-4fm4j -n calico-system -c csi-node-driver-registrar -- /bin/sh
    error: unable to upgrade connection: container not found ("csi-node-driver-registrar")

    I have tried editing the daemonset to use a different version of docker.io/calico/node-driver-registrar, but the problem remains. 

    I remember using Calico when I tried IBM Cloud Private some time ago. Does OpenShift still use Calico? 



    ------------------------------
    Yan Zhan
    ------------------------------


  • 2.  RE: Exec format error with Calico CNI

    Posted Mon February 13, 2023 08:36 AM

    `exec format error` usually indicates either that there is no ppc64le container image or that the binary inside the image is not ppc64le-compatible. However, calico/node-driver-registrar does have ppc64le images, and I also checked that node-driver-registrar is executable, at least in the current master tag.

    What is the exact version/tag of node-driver-registrar you're using?

    You can also check the architecture of your image (the command depends on the container engine), e.g.:

    ```bash
    docker inspect calico/node-driver-registrar:master | grep -i "Arch"
            "Architecture": "ppc64le",
    ```



    ------------------------------
    Marvin Gießing
    ------------------------------



  • 3.  RE: Exec format error with Calico CNI

    Posted Mon February 13, 2023 11:40 AM

    Hi Marvin,

    I followed https://docs.tigera.io/calico/3.25/getting-started/kubernetes/self-managed-onprem/onpremises to deploy Calico on my system.

    Looks like it pulled in docker.io/calico/node-driver-registrar:v3.24.2. It does seem to have files for ppc64le:

    # ctr image ls
    REF                                            TYPE                                                      DIGEST                                                                  SIZE     PLATFORMS                                                                                                                           LABELS
    docker.io/calico/node-driver-registrar:v3.24.2 application/vnd.docker.distribution.manifest.list.v2+json sha256:1c25bfedcfac04fd7286ab7174bb78a7fc97a50ab78a1db2e911282d49a7ac3c 11.1 MiB linux/amd64,linux/arm/v7,linux/arm64,linux/ppc64le

    However, if I try to run it, it fails with exec format error:

    # ctr run --rm -t docker.io/calico/node-driver-registrar:v3.24.2 test
    exec /usr/local/bin/node-driver-registrar: exec format error



    ------------------------------
    Yan Zhan
    ------------------------------



  • 4.  RE: Exec format error with Calico CNI

    Posted Mon February 13, 2023 01:07 PM

    Yan, can you share the output of "$ file /usr/local/bin/node-driver-registrar" - it sounds like that will be an x86-64 binary rather than a ppc64le binary.

    gerrit



    ------------------------------
    Gerrit
    ------------------------------



  • 5.  RE: Exec format error with Calico CNI

    Posted Mon February 13, 2023 04:25 PM

    Hi Gerrit,

    I'm unable to do that since the container doesn't run. 
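
    If the container refuses to start, the architecture can still be read from the binary's ELF header once the file has been copied out of the image. A minimal sketch, assuming a POSIX shell with `od`; the function name `elf_arch` is made up for illustration, and it only knows the three machine codes listed in the `case`:

    ```bash
    # Hypothetical helper: print the machine type and byte order from an ELF header.
    elf_arch() {
        f="$1"
        # Bytes 0-3 must be the ELF magic: 0x7f 'E' 'L' 'F'.
        magic=$(od -An -tx1 -N4 "$f" | tr -d ' \n')
        [ "$magic" = "7f454c46" ] || { echo "not-elf"; return 1; }
        # Byte 5 (EI_DATA): 1 = little-endian, 2 = big-endian.
        data=$(od -An -tu1 -j5 -N1 "$f" | tr -d ' \n')
        # Bytes 18-19 hold e_machine, stored in the file's own byte order.
        set -- $(od -An -tu1 -j18 -N2 "$f")
        [ $# -eq 2 ] || { echo "truncated"; return 1; }
        if [ "$data" = "1" ]; then
            machine=$(( $1 + $2 * 256 )); endian="little-endian"
        else
            machine=$(( $1 * 256 + $2 )); endian="big-endian"
        fi
        case "$machine" in
            62)  name="x86-64" ;;
            21)  name="ppc64" ;;
            183) name="aarch64" ;;
            *)   name="machine-$machine" ;;
        esac
        echo "$name $endian"
    }
    ```

    One way to get the file without executing it, assuming Docker is available on the node, is `docker create` followed by `docker cp`, since neither runs the entrypoint. A ppc64le build should report `ppc64 little-endian`.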

    While I was looking around I found something intriguing. The digest for my container is 1c25bfedcfac0, which doesn't match any image under the tag v3.24.2: https://hub.docker.com/layers/calico/node-driver-registrar/v3.24.2/images/sha256-c9eabc1a350c9e0e9cf4e301f1975b97e4165c87d36b1f20a37f3fba42a4e78c. Re-pulling the image still gives the same digest.

    Also worth noting is that another container in the same pod, docker.io/calico/csi:v3.24.2 is running fine. Inspecting both containers gives ppc64le arch.



    ------------------------------
    Yan Zhan
    ------------------------------



  • 6.  RE: Exec format error with Calico CNI

    Posted Tue February 14, 2023 04:46 AM

    Here is some more information on why this is happening:

    $ docker run -ti --entrypoint sh --platform linux/ppc64le docker.io/calico/node-driver-registrar:v3.24.2
    / #
    / #
    / # uname -a
    Linux d9f0f7a93322 5.15.49-linuxkit #1 SMP PREEMPT Tue Sep 13 07:51:32 UTC 2022 ppc64le Linux
    / # ls -l /usr/local/bin/node-driver-registrar
    -rwxr-xr-x    1 root     root      17410539 Oct 18 22:19 /usr/local/bin/node-driver-registrar
    / # cat /etc/os-release
    NAME="Alpine Linux"
    ID=alpine
    VERSION_ID=3.16.2
    PRETTY_NAME="Alpine Linux v3.16"
    HOME_URL="https://alpinelinux.org/"
    BUG_REPORT_URL="https://gitlab.alpinelinux.org/alpine/aports/-/issues"
    / # apk update
    fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/main/ppc64le/APKINDEX.tar.gz
    fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/community/ppc64le/APKINDEX.tar.gz
    v3.16.4-2-gc2c0bdb96e5 [https://dl-cdn.alpinelinux.org/alpine/v3.16/main]
    v3.16.3-170-gb4524a0c7b8 [https://dl-cdn.alpinelinux.org/alpine/v3.16/community]
    OK: 15979 distinct packages available
    / # apk add file
    (1/2) Installing libmagic (5.41-r0)
    (2/2) Installing file (5.41-r0)
    Executing busybox-1.35.0-r17.trigger
    OK: 14 MiB in 16 packages
    / # file /usr/local/bin/node-driver-registrar
    /usr/local/bin/node-driver-registrar: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=yEcajpaWVlm_R2E3TCA7/_eRYxwZi4gHmFurSlqys/t3mqKnheDimdsein_Bl0/aeD2Yn0-PW2eurB7ZocW, not stripped
    / #

    The image is marked as ppc64le, but the binary shipped inside it is x86-64. That said, the CSI images are not needed for the Calico CNI to work, so I'm wondering how the CNI was deployed in this setup. Can you point us to the steps you followed to install the Calico CNI?

    Note: I will create an issue for this component in the Calico community and get it fixed.



    ------------------------------
    Manjunath Kumatagi
    ------------------------------



  • 7.  RE: Exec format error with Calico CNI

    Posted Tue February 14, 2023 10:09 AM

    @Manjunath Kumatagi This is very weird. I see the same thing, yet I can run the binary even though it is x86:

    This is on a Power9 host:

    ```bash
    $ docker run -ti --rm --entrypoint sh docker.io/calico/node-driver-registrar:v3.24.2
    / # uname -m
    ppc64le
    / # apk update
    fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/main/ppc64le/APKINDEX.tar.gz
    fetch https://dl-cdn.alpinelinux.org/alpine/v3.16/community/ppc64le/APKINDEX.tar.gz
    v3.16.4-2-gc2c0bdb96e5 [https://dl-cdn.alpinelinux.org/alpine/v3.16/main]
    v3.16.3-170-gb4524a0c7b8 [https://dl-cdn.alpinelinux.org/alpine/v3.16/community]
    OK: 15979 distinct packages available
    / # apk add file
    (1/2) Installing libmagic (5.41-r0)
    (2/2) Installing file (5.41-r0)
    Executing busybox-1.35.0-r17.trigger
    OK: 14 MiB in 16 packages
    / # file /usr/local/bin/node-driver-registrar
    /usr/local/bin/node-driver-registrar: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), statically linked, Go BuildID=yEcajpaWVlm_R2E3TCA7/_eRYxwZi4gHmFurSlqys/t3mqKnheDimdsein_Bl0/aeD2Yn0-PW2eurB7ZocW, not stripped
    / # node-driver-registrar --version
    node-driver-registrar v2.5.1-0-ga31bf16
    ```

    Do you know why that is...?



    ------------------------------
    Marvin Gießing
    ------------------------------



  • 8.  RE: Exec format error with Calico CNI

    Posted Tue February 14, 2023 05:22 PM

    @Marvin Gießing -- it's b/c you overrode the entrypoint from the x86 binary to `sh` and the shell binary is the correct architecture. 



    ------------------------------
    Christy Norman
    ------------------------------



  • 9.  RE: Exec format error with Calico CNI

    Posted Wed February 15, 2023 08:04 AM

    I suspect this is because it is running in a buildx environment where emulation is enabled, so those binaries run in emulated mode.
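
    The emulation theory can be checked on the host. User-mode emulation is normally wired up through the kernel's binfmt_misc, whose registered handlers appear as files under /proc/sys/fs/binfmt_misc. A rough sketch, assuming that mount point and that each handler file starts with `enabled` or `disabled` followed by an `interpreter` line; `binfmt_handlers` is a hypothetical helper name:

    ```bash
    # Hypothetical helper: list enabled binfmt_misc handlers and their interpreters.
    # The directory argument is optional and exists mainly to make testing easy.
    binfmt_handlers() {
        dir="${1:-/proc/sys/fs/binfmt_misc}"
        for entry in "$dir"/*; do
            [ -f "$entry" ] || continue
            name=$(basename "$entry")
            # "register" and "status" are control files, not handlers.
            case "$name" in register|status) continue ;; esac
            if head -n 1 "$entry" | grep -q '^enabled'; then
                interp=$(sed -n 's/^interpreter //p' "$entry")
                echo "$name $interp"
            fi
        done
    }
    ```

    If a handler for x86-64 shows up (often named `qemu-x86_64`, though the name depends on how QEMU was registered), x86-64 binaries can run on the Power host through emulation, which would explain the working `--version` call above.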



    ------------------------------
    Manjunath Kumatagi
    ------------------------------



  • 10.  RE: Exec format error with Calico CNI

    Posted Tue February 14, 2023 09:44 AM

    You might have to follow the manifest option at the moment.

    Instructions at https://docs.tigera.io/calico/3.25/getting-started/kubernetes/self-managed-onprem/onpremises#install-calico-with-kubernetes-api-datastore-50-nodes-or-less



    ------------------------------
    Rajalakshmi Girish
    ------------------------------



  • 11.  RE: Exec format error with Calico CNI

    Posted Tue February 14, 2023 11:06 AM

    Thank you Rajalakshmi. I redeployed with the manifest option and it's working beautifully.



    ------------------------------
    Yan Zhan
    ------------------------------



  • 12.  RE: Exec format error with Calico CNI

    Posted Wed February 15, 2023 08:26 AM

    Created issue https://github.com/projectcalico/calico/issues/7351 for this.



    ------------------------------
    Manjunath Kumatagi
    ------------------------------



  • 13.  RE: Exec format error with Calico CNI

    Posted Tue February 21, 2023 02:58 AM
    Edited by Rajalakshmi Girish Tue February 21, 2023 02:58 AM

    The storage section seems to be enabled by default.

    Adding the section below to the Installation resource ensures that no attempt is made to bring up the csi-node-driver pod.

    spec:
      flexVolumePath: None
      kubeletVolumePluginPath: None

    https://raw.githubusercontent.com/projectcalico/calico/v3.25.0/manifests/custom-resources.yaml
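
    On a cluster where the operator is already installed, the same two fields could probably be applied as a merge patch instead of re-applying the manifest. A sketch only: `csi_disable_patch` is a made-up helper, and the resource name `default` is assumed from the custom-resources.yaml linked above:

    ```bash
    # Hypothetical helper: print a JSON merge patch carrying the two fields above.
    csi_disable_patch() {
        printf '%s\n' '{"spec":{"flexVolumePath":"None","kubeletVolumePluginPath":"None"}}'
    }

    # Usage (needs a running cluster, so it is left commented out here):
    #   kubectl patch installation default --type merge -p "$(csi_disable_patch)"
    ```

    Printing the patch separately makes it easy to review before it is sent to the API server.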



    ------------------------------
    Rajalakshmi Girish
    ------------------------------