Cloud Pak for Data


catalog-api-xxxx, The Pod

  • 1.  catalog-api-xxxx, The Pod

    Posted Mon March 09, 2020 02:14 AM
    Hello.

    Whenever Cloud Pak for Data is running, two pods named catalog-api-xxxxxxx (where xxxxxxx is a random alphanumeric suffix) show a status of 0/1 Running.
    Is something wrong with them, or is this a normal status?

    Regards,
    Chris

    ------------------------------
    Chris
    ------------------------------

    #CloudPakforDataGroup


  • 2.  RE: catalog-api-xxxx, The Pod

    Posted Mon March 09, 2020 04:46 AM

    Hi,

    All pods should be in the 1/1 state (more generally X/X, meaning every container in the pod is ready).

    Can you look into the logs of the pods?

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------
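
    To make the X/X rule concrete: the READY column of "oc get pods" shows ready/total containers, and a pod is healthy only when the two numbers match. Below is a minimal Python sketch that picks out the pods where they differ. The sample text is illustrative only: it uses the pod names and restart counts quoted later in this thread, while the AGE values and the wdp-couchdb-0 row are assumptions. The helper name not_ready_pods is hypothetical.

```python
# Sketch: find pods whose READY column is not X/X.
# SAMPLE is illustrative text in the shape of `oc get pods` output;
# the AGE values and the wdp-couchdb-0 row are assumptions.
SAMPLE = """\
NAME                           READY   STATUS    RESTARTS   AGE
catalog-api-757f794598-8sjbw   0/1     Running   2          1h
catalog-api-757f794598-ks5q8   0/1     Running   65         4d
wdp-couchdb-0                  1/1     Running   0          4d
"""


def not_ready_pods(get_pods_output):
    """Return (name, ready, total) for every pod where ready != total."""
    result = []
    for line in get_pods_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 2 or "/" not in fields[1]:
            continue  # blank or unexpected line
        ready, total = (int(n) for n in fields[1].split("/"))
        if ready != total:
            result.append((fields[0], ready, total))
    return result


if __name__ == "__main__":
    for name, ready, total in not_ready_pods(SAMPLE):
        print(f"{name}: {ready}/{total} containers ready")
```

    A pod that stays at 0/1 while its STATUS is Running usually means the container process is up but its readiness probe is not passing.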



  • 3.  RE: catalog-api-xxxx, The Pod

    Posted Mon March 09, 2020 06:28 AM
    Hello,

    Thank you for your reply.


    Here is more information about the pods.
    *Some details, such as IP addresses, ports, and usernames, are masked for security.

    [xxxx@xxx01 ~]# oc describe po catalog-api-757f794598-8sjbw
    Name:               catalog-api-757f794598-8sjbw
    Namespace:          icp4d
    Priority:           0
    PriorityClassName:  <none>
    Node:               worker03/10.xxx.x.x
    Start Time:         Mon, 09 Mar 2020 14:03:43 +0900
    Labels:             app=catalog-api
                        chart=catalog-api-charts-0.1.241
                        heritage=Tiller
                        pod-template-hash=3139350154
                        release=0027-wkc-base
                        wkc=wkc
    Annotations:        openshift.io/scc=restricted
                        productID=ICP4D-WKCBase-Prod-00000
                        productId=wkc
                        productName=Watson Knowledge Catalog Base Services
                        productVersion=3.0
    Status:             Running
    IP:                 1xx.xx.x.xx
    Controlled By:      ReplicaSet/catalog-api-757f794598
    Containers:
      catalog-api:
        Container ID:   cri-o://b4c27cf111c097fb24318452cec4c589287ecae115d354804c85aeee4b03ff0c
        Image:          cp.icr.io/cp/cpd/catalog_master:2.0.0-20191030004426-9bd7d3f
        Image ID:       cp.icr.io/cp/cpd/catalog_master@sha256:1541d393e10a766e852e22301b6e2502df5597e758b0f2405b41c968d7849c71
        Port:           9xxx/TCP
        Host Port:      0/TCP
        State:          Running
          Started:      Mon, 09 Mar 2020 14:48:11 +0900
        Last State:     Terminated
          Reason:       Error
          Exit Code:    137
          Started:      Mon, 09 Mar 2020 14:26:11 +0900
          Finished:     Mon, 09 Mar 2020 14:48:10 +0900
        Ready:          False
        Restart Count:  2
        Limits:
          cpu:     2500m
          memory:  4Gi
        Requests:
          cpu:     300m
          memory:  512Mi
        Liveness:   http-get https://:9xxx/v2/catalogs/heartbeat%3FdependencyChk=false delay=960s timeout=30s period=120s #success=1 #failure=3
        Readiness:  http-get https://:9xxx/v2/catalogs/heartbeat%3FdependencyChk=false delay=60s timeout=30s period=30s #success=1 #failure=30
        Environment:
          DATALAKE_DBCONF_DIR:  /opt/ibm/wlp/usr/servers/defaultServer/resources/
          environment:  prod
          environment_type:  wkc
          base_url:  <set to the key 'host-url' of config map 'wdp-config'>  Optional: false
          file_service_url:  $(base_url)/v2/asset_files
          wml_url:  $(base_url)
          project_api_url:  $(base_url)/v2/projects
          connection_api_url:  $(base_url)/v2
          entitlement_api_url:  $(base_url)/v2/entitlements
          dps_url:  $(base_url)/v2
          global_search_index_url:  $(base_url)/v3/search_index
          url_override:  /v2
          lineage_url:  $(base_url)/v2/lineage_events
          WDP_SERVICE_ID_CREDENTIAL:  <set to the key 'service-id-credentials' in secret 'wdp-service-id'>  Optional: false
          WDP_SERVICE_ID:  <set to the key 'service-id' in secret 'wdp-service-id'>  Optional: false
          accredited_service_metering_01:  $(WDP_SERVICE_ID)
          accredited_service_editors_01:  $(WDP_SERVICE_ID)
          skip_new_owner_check:  $(WDP_SERVICE_ID)
          accredited_service_viewers_01:  $(WDP_SERVICE_ID)
          global_type_creator_service_id_01:  $(WDP_SERVICE_ID)
          global_asset_type_server_creator_id:  $(WDP_SERVICE_ID)
          cams_administration_editors_01:  $(WDP_SERVICE_ID)
          cams_administration_viewers_01:  $(WDP_SERVICE_ID)
          cams_operators_01:  $(WDP_SERVICE_ID)
          wkc_account_managers_01:  $(WDP_SERVICE_ID)
          dps_skipped_services:  $(WDP_SERVICE_ID)
          cams_omrs_asset_administrator:  $(WDP_SERVICE_ID)
          CLOUDANT_USER:  <set to the key 'username' in secret 'wdp-cloudant-creds'>  Optional: false
          CLOUDANT_PASSWORD:  <set to the key 'password' in secret 'wdp-cloudant-creds'>  Optional: false
          rabbitmq_uri:  <set to the key 'rabbitmq-url.txt' in secret 'rabbitmq-url'>  Optional: false
          icp4d_usermgmt_url:  <set to the key 'icp4d-host-url' of config map 'wdp-config'>  Optional: false
          ICP4D_INTERNAL_USERMGMT_URL:  http://usermgmt-svc:8080
          redis_url:  <set to the key 'redis-url.txt' in secret 'redis-url'>  Optional: false
        Mounts:
          /etc/wdp_certs from wdp-certs (ro)
          /opt/ibm/wlp/usr/servers/defaultServer/resources from resources (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-zl8kg (ro)
    Conditions:
      Type             Status
      Initialized      True
      Ready            False
      ContainersReady  False
      PodScheduled     True
    Volumes:
      resources:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  catalog-properties
        Optional:    false
      wdp-certs:
        <unknown>
      default-token-zl8kg:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-zl8kg
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  node-role.kubernetes.io/compute=true
    Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Scheduled  1h                 default-scheduler  Successfully assigned icp4d/catalog-api-757f794598-8sjbw to worker03
      Normal   Pulled     1h                 kubelet, worker03  Container image "cp.icr.io/cp/cpd/catalog_master:2.0.0-20191030004426-9bd7d3f" already present on machine
      Normal   Created    1h                 kubelet, worker03  Created container
      Normal   Started    1h                 kubelet, worker03  Started container
      Warning  Unhealthy  3m (x112 over 1h)  kubelet, worker03  Readiness probe failed: Get https://1xx.xx.x.xx:9xxx/v2/catalogs/heartbeat?dependencyChk=false: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
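
    A note on "Exit Code: 137" in the output above: container exit codes above 128 conventionally mean the process was ended by a signal, and the signal number is the code minus 128. A minimal Python sketch of that decoding follows; the function name is hypothetical and the signal names are those of the platform the script runs on (Linux assumed).

```python
import signal


def describe_exit_code(code):
    """Explain a container exit code using the 128 + signal convention."""
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = f"signal {signum}"  # number not known on this platform
        return f"terminated by {name}"
    return f"exited with status {code}"


if __name__ == "__main__":
    print(describe_exit_code(137))  # 137 - 128 = 9
```

    SIGKILL on its own does not say who sent it: it can come from the kubelet after failed liveness probes or from the kernel's out-of-memory killer. Here the Reason is "Error" rather than "OOMKilled", which would fit a liveness restart, but that is an inference and not something this output proves.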




    [xxxx@xxx01 ~]# oc describe po catalog-api-757f794598-ks5q8
    Name:               catalog-api-757f794598-ks5q8
    Namespace:          icp4d
    Priority:           0
    PriorityClassName:  <none>
    Node:               worker02/10.xxx.x.x
    Start Time:         Thu, 05 Mar 2020 11:13:37 +0900
    Labels:             app=catalog-api
                        chart=catalog-api-charts-0.1.241
                        heritage=Tiller
                        pod-template-hash=3139350154
                        release=0027-wkc-base
                        wkc=wkc
    Annotations:        openshift.io/scc=restricted
                        productID=ICP4D-WKCBase-Prod-00000
                        productId=wkc
                        productName=Watson Knowledge Catalog Base Services
                        productVersion=3.0
    Status:             Running
    IP:                 1xx.xx.xx.xxx
    Controlled By:      ReplicaSet/catalog-api-757f794598
    Containers:
      catalog-api:
        Container ID:   cri-o://02a7f43a09a6f1fbac598cc3d61cfcc2a4d6752c70a9867cffebd3af354ce225
        Image:          cp.icr.io/cp/cpd/catalog_master:2.0.0-20191030004426-9bd7d3f
        Image ID:       cp.icr.io/cp/cpd/catalog_master@sha256:1541d393e10a766e852e22301b6e2502df5597e758b0f2405b41c968d7849c71
        Port:           9xxx/TCP
        Host Port:      0/TCP
        State:          Running
          Started:      Mon, 09 Mar 2020 15:04:49 +0900
        Last State:     Terminated
          Reason:       Error
          Exit Code:    137
          Started:      Mon, 09 Mar 2020 14:42:49 +0900
          Finished:     Mon, 09 Mar 2020 15:04:45 +0900
        Ready:          False
        Restart Count:  65
        Limits:
          cpu:     2500m
          memory:  4Gi
        Requests:
          cpu:     300m
          memory:  512Mi
        Liveness:   http-get https://:9xxx/v2/catalogs/heartbeat%3FdependencyChk=false delay=960s timeout=30s period=120s #success=1 #failure=3
        Readiness:  http-get https://:9xxx/v2/catalogs/heartbeat%3FdependencyChk=false delay=60s timeout=30s period=30s #success=1 #failure=30
        Environment:
          DATALAKE_DBCONF_DIR:  /opt/ibm/wlp/usr/servers/defaultServer/resources/
          environment:  prod
          environment_type:  wkc
          base_url:  <set to the key 'host-url' of config map 'wdp-config'>  Optional: false
          file_service_url:  $(base_url)/v2/asset_files
          wml_url:  $(base_url)
          project_api_url:  $(base_url)/v2/projects
          connection_api_url:  $(base_url)/v2
          entitlement_api_url:  $(base_url)/v2/entitlements
          dps_url:  $(base_url)/v2
          global_search_index_url:  $(base_url)/v3/search_index
          url_override:  /v2
          lineage_url:  $(base_url)/v2/lineage_events
          WDP_SERVICE_ID_CREDENTIAL:  <set to the key 'service-id-credentials' in secret 'wdp-service-id'>  Optional: false
          WDP_SERVICE_ID:  <set to the key 'service-id' in secret 'wdp-service-id'>  Optional: false
          accredited_service_metering_01:  $(WDP_SERVICE_ID)
          accredited_service_editors_01:  $(WDP_SERVICE_ID)
          skip_new_owner_check:  $(WDP_SERVICE_ID)
          accredited_service_viewers_01:  $(WDP_SERVICE_ID)
          global_type_creator_service_id_01:  $(WDP_SERVICE_ID)
          global_asset_type_server_creator_id:  $(WDP_SERVICE_ID)
          cams_administration_editors_01:  $(WDP_SERVICE_ID)
          cams_administration_viewers_01:  $(WDP_SERVICE_ID)
          cams_operators_01:  $(WDP_SERVICE_ID)
          wkc_account_managers_01:  $(WDP_SERVICE_ID)
          dps_skipped_services:  $(WDP_SERVICE_ID)
          cams_omrs_asset_administrator:  $(WDP_SERVICE_ID)
          CLOUDANT_USER:  <set to the key 'username' in secret 'wdp-cloudant-creds'>  Optional: false
          CLOUDANT_PASSWORD:  <set to the key 'password' in secret 'wdp-cloudant-creds'>  Optional: false
          rabbitmq_uri:  <set to the key 'rabbitmq-url.txt' in secret 'rabbitmq-url'>  Optional: false
          icp4d_usermgmt_url:  <set to the key 'icp4d-host-url' of config map 'wdp-config'>  Optional: false
          ICP4D_INTERNAL_USERMGMT_URL:  http://usermgmt-svc:8080
          redis_url:  <set to the key 'redis-url.txt' in secret 'redis-url'>  Optional: false
        Mounts:
          /etc/wdp_certs from wdp-certs (ro)
          /opt/ibm/wlp/usr/servers/defaultServer/resources from resources (rw)
          /var/run/secrets/kubernetes.io/serviceaccount from default-token-zl8kg (ro)
    Conditions:
      Type             Status
      Initialized      True
      Ready            False
      ContainersReady  False
      PodScheduled     True
    Volumes:
      resources:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  catalog-properties
        Optional:    false
      wdp-certs:
        <unknown>
      default-token-zl8kg:
        Type:        Secret (a volume populated by a Secret)
        SecretName:  default-token-zl8kg
        Optional:    false
    QoS Class:       Burstable
    Node-Selectors:  node-role.kubernetes.io/compute=true
    Tolerations:     node.kubernetes.io/memory-pressure:NoSchedule
    Events:
      Type     Reason     Age                From               Message
      ----     ------     ----               ----               -------
      Normal   Killing    9m (x15 over 5h)   kubelet, worker02  Killing container with id cri-o://catalog-api:Container failed liveness probe.. Container will be killed and recreated.
      Warning  Unhealthy  4m (x590 over 5h)  kubelet, worker02  Readiness probe failed: Get https://1xx.xx.xx.xxx:9xxx/v2/catalogs/heartbeat?dependencyChk=false: net/http: request canceled (Client.Timeout exceeded while awaiting headers)
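
    The probe settings and the timestamps in the two outputs above can be cross-checked with a little arithmetic. The sketch below is a simplified model and not the kubelet's exact algorithm: it assumes the first liveness probe fires at the initial delay and that each failing probe takes the full timeout, and it ignores jitter and the termination grace period. The function names are hypothetical.

```python
from email.utils import parsedate_to_datetime


def earliest_liveness_kill(delay, period, timeout, failures):
    """Seconds after container start at which the last of `failures`
    consecutive probes has timed out (simplified model)."""
    return delay + (failures - 1) * period + timeout


def lifetime_seconds(started, finished):
    """Seconds between two timestamps in `oc describe` format."""
    delta = parsedate_to_datetime(finished) - parsedate_to_datetime(started)
    return int(delta.total_seconds())


# Values copied from the Liveness line and the Last State blocks above.
BUDGET = earliest_liveness_kill(delay=960, period=120, timeout=30, failures=3)
POD_8SJBW = lifetime_seconds("Mon, 09 Mar 2020 14:26:11 +0900",
                             "Mon, 09 Mar 2020 14:48:10 +0900")
POD_KS5Q8 = lifetime_seconds("Mon, 09 Mar 2020 14:42:49 +0900",
                             "Mon, 09 Mar 2020 15:04:45 +0900")

if __name__ == "__main__":
    print("modelled earliest kill:", BUDGET, "s")
    print("8sjbw previous container lived:", POD_8SJBW, "s")
    print("ks5q8 previous container lived:", POD_KS5Q8, "s")
```

    Both previous containers lived about 22 minutes, a little longer than the modelled 1230 seconds. That fits the "Container failed liveness probe" event: the service apparently never answers the heartbeat in time, so the kubelet restarts it on every cycle. The readiness probe (#failure=30) only keeps the pod out of service; it does not restart the container.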


    Regards,
    Chris

    ------------------------------
    Chris
    ------------------------------



  • 4.  RE: catalog-api-xxxx, The Pod

    Posted Tue March 10, 2020 09:58 AM

    Hi,

    Can you try collecting logs (oc logs <pod> --since=1h) for each pod?

    Please do not attach them here; just post any relevant errors.

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------
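
    For picking the relevant errors out of a long log, a small filter can help. This is a minimal Python sketch, assuming the log was first saved to a file (for example by redirecting the oc logs command above). The script name filter_log.py, the function name, and the default keywords are hypothetical choices, not anything defined by Cloud Pak for Data.

```python
import sys


def matching_lines(lines, keywords=("error", "fail", "exception")):
    """Yield lines that contain any keyword, ignoring case."""
    lowered = tuple(k.lower() for k in keywords)
    for line in lines:
        text = line.rstrip("\n")
        if any(k in text.lower() for k in lowered):
            yield text


if __name__ == "__main__" and len(sys.argv) > 1:
    # Assumed usage:
    #   oc logs <pod> --since=1h > pod.log
    #   python filter_log.py pod.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
        for hit in matching_lines(handle):
            print(hit)
```

    A plain substring match like this is deliberately broad; it will also catch harmless lines that merely mention one of the words, so the result still needs a manual read.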



  • 5.  RE: catalog-api-xxxx, The Pod

    Posted Tue March 10, 2020 09:05 PM
    Hello,

    Thank you for your reply.
    I am sorry, but I have attached two logs, filtered by time and by the word "fail".

    Thank you,

    ------------------------------
    Chris
    ------------------------------

    Attachment(s)

    error_wdp01.txt (134 KB)
    error_wdp02.txt (698 B)


  • 6.  RE: catalog-api-xxxx, The Pod

    Posted Wed March 11, 2020 06:53 AM

    Hi,

    Can you try restarting the following pods:

    wdp-couchdb-0
    wdp-couchdb-1
    wdp-couchdb-2

    and then the catalog-api pods?

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------



  • 7.  RE: catalog-api-xxxx, The Pod

    Posted Wed March 11, 2020 07:54 PM
    Hello,

    Do you mean I should run the commands below?

    oc delete po wdp-couchdb-0
    oc delete po wdp-couchdb-1
    oc delete po wdp-couchdb-2

    and then,

    oc delete po catalog-api-xxxxx    # the 1st catalog-api pod
    oc delete po catalog-api-xxxxx    # the 2nd catalog-api pod


    Thank you,
    Chris


    ------------------------------
    Chris
    ------------------------------



  • 8.  RE: catalog-api-xxxx, The Pod

    Posted Thu March 12, 2020 05:36 AM
    Hi,

    Yes, please.
    If that doesn't help, please open a PMR/ticket with support.

    Thanks

    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------
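
    The restart order discussed above (the three wdp-couchdb pods first, then the catalog-api pods) can be written down as a small plan. This is a minimal Python sketch that only builds the oc delete commands and prints them; it does not run anything. The namespace icp4d and the two catalog-api pod names are taken from the describe output earlier in the thread, and the function name is hypothetical.

```python
import shlex


def restart_plan(first, then, namespace):
    """Return `oc delete pod` commands: all of `first`, then all of `then`."""
    return [["oc", "delete", "pod", name, "-n", namespace]
            for name in list(first) + list(then)]


COUCHDB = ["wdp-couchdb-0", "wdp-couchdb-1", "wdp-couchdb-2"]
CATALOG = ["catalog-api-757f794598-8sjbw", "catalog-api-757f794598-ks5q8"]
PLAN = restart_plan(COUCHDB, CATALOG, namespace="icp4d")

if __name__ == "__main__":
    for argv in PLAN:
        print(shlex.join(argv))  # print only; review before running
```

    It is probably worth waiting until the couchdb pods report 1/1 again (for example by watching oc get pods) before deleting the catalog-api pods. The catalog-api pods are controlled by a ReplicaSet, so they come back under new names after deletion.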



  • 9.  RE: catalog-api-xxxx, The Pod

    Posted Thu March 12, 2020 11:47 PM
    Hi,

    In my CP4D environment, the probe path for these pods is "/v2/catalogs/heartbeat?dependencyChk=false".
    Is this the same as yours?


    The result of "oc get po catalog-api-xxxxxx -o yaml":
    ...
      containers:
      - env:
        - name: DATALAKE_DBCONF_DIR
          value: /opt/ibm/wlp/usr/servers/defaultServer/resources/
        - name: environment
          value: prod
        - name: environment_type
          value: wkc
        - name: base_url
          valueFrom:
            configMapKeyRef:
              key: host-url
              name: wdp-config
        - name: file_service_url
          value: $(base_url)/v2/asset_files
        - name: wml_url
          value: $(base_url)
        - name: project_api_url
          value: $(base_url)/v2/projects
        - name: connection_api_url
          value: $(base_url)/v2
        - name: entitlement_api_url
          value: $(base_url)/v2/entitlements
        - name: dps_url
          value: $(base_url)/v2
        - name: global_search_index_url
          value: $(base_url)/v3/search_index
        - name: url_override
          value: /v2
        - name: lineage_url
          value: $(base_url)/v2/lineage_events
        - name: WDP_SERVICE_ID_CREDENTIAL
          valueFrom:
            secretKeyRef:
              key: service-id-credentials
              name: wdp-service-id
        - name: WDP_SERVICE_ID
          valueFrom:
            secretKeyRef:
              key: service-id
              name: wdp-service-id
        - name: accredited_service_metering_01
          value: $(WDP_SERVICE_ID)
        - name: accredited_service_editors_01
          value: $(WDP_SERVICE_ID)
        - name: skip_new_owner_check
          value: $(WDP_SERVICE_ID)
        - name: accredited_service_viewers_01
          value: $(WDP_SERVICE_ID)
        - name: global_type_creator_service_id_01
          value: $(WDP_SERVICE_ID)
        - name: global_asset_type_server_creator_id
          value: $(WDP_SERVICE_ID)
        - name: cams_administration_editors_01
          value: $(WDP_SERVICE_ID)
        - name: cams_administration_viewers_01
          value: $(WDP_SERVICE_ID)
        - name: cams_operators_01
          value: $(WDP_SERVICE_ID)
        - name: wkc_account_managers_01
          value: $(WDP_SERVICE_ID)
        - name: dps_skipped_services
          value: $(WDP_SERVICE_ID)
        - name: cams_omrs_asset_administrator
          value: $(WDP_SERVICE_ID)
        - name: CLOUDANT_USER
          valueFrom:
            secretKeyRef:
              key: username
              name: wdp-cloudant-creds
        - name: CLOUDANT_PASSWORD
          valueFrom:
            secretKeyRef:
              key: password
              name: wdp-cloudant-creds
        - name: rabbitmq_uri
          valueFrom:
            secretKeyRef:
              key: rabbitmq-url.txt
              name: rabbitmq-url
        - name: icp4d_usermgmt_url
          valueFrom:
            configMapKeyRef:
              key: icp4d-host-url
              name: wdp-config
        - name: ICP4D_INTERNAL_USERMGMT_URL
          value: http://usermgmt-svc:xxxx
        - name: redis_url
          valueFrom:
            secretKeyRef:
              key: redis-url.txt
              name: redis-url
        image: cp.icr.io/cp/cpd/catalog_master:2.0.0-20191030004426-9bd7d3f
        imagePullPolicy: IfNotPresent
        livenessProbe:
          failureThreshold: 3
          httpGet:
            path: /v2/catalogs/heartbeat?dependencyChk=false
            port: 9443
            scheme: HTTPS
          initialDelaySeconds: 960
          periodSeconds: 120
          successThreshold: 1
          timeoutSeconds: 30
        name: catalog-api
        ports:
        - containerPort: 9443
          protocol: TCP
        readinessProbe:
          failureThreshold: 30
          httpGet:
            path: /v2/catalogs/heartbeat?dependencyChk=false
    ...

    Thank you,
    Chris

    ------------------------------
    Chris
    ------------------------------



  • 10.  RE: catalog-api-xxxx, The Pod

    Posted Fri March 13, 2020 06:26 AM

    Hi,

    Yes, this will be consistent across all installations.

    Thanks



    ------------------------------
    TOMASZ HANUSIAK
    ------------------------------
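
    One detail that may be behind the question about the path: the oc describe output earlier in the thread shows heartbeat%3FdependencyChk=false, while the pod YAML and the probe failure events show heartbeat?dependencyChk=false. %3F is simply the percent-encoded form of "?", so both spell the same path. A quick check in Python, using only the standard library (the constant names are just labels for this example):

```python
from urllib.parse import quote, unquote

DESCRIBED = "/v2/catalogs/heartbeat%3FdependencyChk=false"  # from oc describe
CONFIGURED = "/v2/catalogs/heartbeat?dependencyChk=false"   # from the pod YAML

if __name__ == "__main__":
    print(unquote(DESCRIBED) == CONFIGURED)  # True
    print(quote("?", safe=""))               # %3F
```

    Why describe displays the encoded form is an assumption on my part (it appears to render the probe as a URL); the events show that the actual request uses the plain "?" form.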



  • 11.  RE: catalog-api-xxxx, The Pod

    Posted Sun March 15, 2020 09:03 PM
    Hi Tomasz,

    Thank you for your confirmation.

    As for this problem, I will wait and see for a while, since my environment is a development one.

    Thank you,
    Chris

    ------------------------------
    Chris
    ------------------------------