Hi,
I have repeatedly deleted and reinstalled Data Virtualization.
It works fine immediately after installation; however, the service no longer comes up after the machines are restarted.
Even if I restart the dv-0 pod, its status remains 2/3 (two of its three containers ready).
Here is the relevant output (IP addresses are masked).
---------------------------------------------
[root@master01 ~]# oc get pod |grep dv
dv-0 2/3 CrashLoopBackOff 20 1h
dv-caching-5c5c8966b7-x549k 0/1 Running 0 1m
dv-worker-0 0/1 Running
---------------------------------------------
It seems the dv-server container never reaches a ready state.
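If it helps, I can also attach the logs of the crashed container; I would collect them as below (container name taken from the describe output that follows; log output omitted here):
---------------------------------------------
oc logs dv-0 -c dv-server --previous --tail=100
---------------------------------------------
Below is the full oc describe output for the pod.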
---------------------------------------------
[root@master01 ~]# oc describe pod dv-0
Name: dv-0
Namespace: icp4d
Priority: 0
PriorityClassName: <none>
Node: worker01/xx.xxx.x.x
Start Time: Mon, 17 Feb 2020 11:55:22 +0900
Labels: app.kubernetes.io/component=head
app.kubernetes.io/instance=dv-0-1581558838223
app.kubernetes.io/managed-by=Tiller
app.kubernetes.io/name=dv
bigsql=bigsql
controller-revision-hash=dv-ffd8b865f
helm.sh/chart=ibm-dv
release=dv-0-1581558838223
statefulset.kubernetes.io/pod-name=dv-0
Annotations: openshift.io/scc=dv-scc
productID=ICP4D-IBMDataVirtualizationv1300_00000
productName=IBM Data Virtualization
productVersion=1.3.0.0
Status: Running
IP: xxx.xx.xx.xx
Controlled By: StatefulSet/dv
Containers:
dv-mariadb:
Container ID: cri-o://f3cf1f6b09a5fd999e86e592bc04f11679d8997576e7f7280dc9af739039b2b4
Image: cp.icr.io/cp/cpd/dv-mariadb:v1.3.0.0-301
Image ID: cp.icr.io/cp/cpd/dv-mariadb@sha256:1db01b8a7614274754911fd113dc382e1c4e45e0fa2749d4b4ce51731361fb56
Port: 3306/TCP
Host Port: 0/TCP
State: Running
Started: Mon, 17 Feb 2020 11:55:28 +0900
Ready: True
Restart Count: 0
Limits:
cpu: 1
memory: 256Mi
Requests:
cpu: 1
memory: 128Mi
Liveness: exec [/opt/dv/current/mariadb-scripts/mariadb_liveness.sh] delay=0s timeout=1s period=10s #success=1 #failure=10
Readiness: exec [/opt/dv/current/mariadb-scripts/mariadb_readiness.sh] delay=0s timeout=1s period=10s #success=1 #failure=10
Environment: <none>
Mounts:
/mnt/PV/versioned from dv-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from dv-sa-token-5vdmj (ro)
dv-server:
Container ID: cri-o://b8b50e524e571d993410b3c8b47e244bc808080300ed49e5fcbab427aeca65f3
Image: cp.icr.io/cp/cpd/dv-engine:v1.3.0.0-301
Image ID: cp.icr.io/cp/cpd/dv-engine@sha256:366c9e4347c0771dfc6603f061a434775e3a43f0580e55a0cf472931f6990835
Ports: 7777/TCP, 32051/TCP, 32052/TCP, 33001/TCP
Host Ports: 0/TCP, 0/TCP, 0/TCP, 0/TCP
Command:
/opt/dv/current/dv-cli.sh
-o
start-dv
--keep-alive
State: Waiting
Reason: CrashLoopBackOff
Last State: Terminated
Reason: Error
Exit Code: 1
Started: Mon, 17 Feb 2020 13:02:42 +0900
Finished: Mon, 17 Feb 2020 13:02:43 +0900
Ready: False
Restart Count: 18
Limits:
cpu: 14
memory: 25Gi
Requests:
cpu: 14
memory: 25Gi
Liveness: exec [/bin/bash -c /opt/dv/current/liveness.sh] delay=300s timeout=10s period=120s #success=1 #failure=10
Readiness: exec [/bin/bash -c -- test -f /mnt/marker_files/.bar; bar_in_progress=$?; nc localhost 32051 < /dev/null; jdbc_ready=$?; test -f /mnt/marker_files/.dv_initialized; dv_svc_ready=$?; if [[ ${bar_in_progress} -eq 0 ]]; then exit 0; elif [[ ${jdbc_ready} -eq 0 ]] && [[ ${dv_svc_ready} -eq 0 ]]; then exit 0; else exit 1; fi] delay=120s timeout=1s period=10s #success=1 #failure=12
Environment:
IS_KUBERNETES: true
Mounts:
/etc/internal-nginx-svc-tls from internal-nginx-svc-tls (ro)
/etc/secret-volume from secret-volume (ro)
/etc/wdp-service-id-secret-volume from wdp-service-id-secret-volume (ro)
/mnt/PV/unversioned from dv-caching-data (rw)
/mnt/PV/versioned from dv-data (rw)
/mnt/PV/versioned/uc_dsserver_shared from dv-data (rw)
/mnt/PV/versioned/unified_console_data from dv-data (rw)
/tmp/container_resources from dv-pod-info (rw)
/var/log from dv-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from dv-sa-token-5vdmj (ro)
dv-opensource:
Container ID: cri-o://1c49e9cafbc24a0ffc8c3034ccfdf869e2a0b6903cc66eebf460396bf6b1867c
Image: cp.icr.io/cp/cpd/dv-engine:v1.3.0.0-301
Image ID: cp.icr.io/cp/cpd/dv-engine@sha256:366c9e4347c0771dfc6603f061a434775e3a43f0580e55a0cf472931f6990835
Port: <none>
Host Port: <none>
Command:
/opt/dv/current/opensource-services-util.sh
-o
start
--keep-alive
State: Running
Started: Mon, 17 Feb 2020 12:45:58 +0900
Last State: Terminated
Reason: Error
Exit Code: 137
Started: Mon, 17 Feb 2020 12:19:58 +0900
Finished: Mon, 17 Feb 2020 12:45:57 +0900
Ready: True
Restart Count: 2
Limits:
cpu: 1
memory: 5Gi
Requests:
cpu: 1
memory: 4Gi
Liveness: exec [/opt/dv/current/opensource-services-util.sh -o check-liveness] delay=300s timeout=15s period=120s #success=1 #failure=10
Readiness: exec [/opt/dv/current/opensource-services-util.sh -o check-readiness] delay=0s timeout=15s period=120s #success=1 #failure=10
Environment:
IS_KUBERNETES: true
Mounts:
/mnt/PV/unversioned from dv-caching-data (rw)
/mnt/PV/versioned from dv-data (rw)
/tmp/opensource_container_resources from dv-pod-info-opensource (rw)
/var/log from dv-data (rw)
/var/run/secrets/kubernetes.io/serviceaccount from dv-sa-token-5vdmj (ro)
Conditions:
Type Status
Initialized True
Ready False
ContainersReady False
PodScheduled True
Volumes:
dv-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: dv-pvc
ReadOnly: false
dv-caching-data:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: dv-caching-pvc
ReadOnly: false
dv-pod-info:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
requests.memory -> mem_request
dv-pod-info-opensource:
Type: DownwardAPI (a volume populated by information about the pod)
Items:
requests.memory -> mem_request
secret-volume:
Type: Secret (a volume populated by a Secret)
SecretName: dv-secret
Optional: false
wdp-service-id-secret-volume:
Type: Secret (a volume populated by a Secret)
SecretName: wdp-service-id
Optional: true
internal-nginx-svc-tls:
Type: Secret (a volume populated by a Secret)
SecretName: internal-nginx-svc-tls
Optional: true
dv-sa-token-5vdmj:
Type: Secret (a volume populated by a Secret)
SecretName: dv-sa-token-5vdmj
Optional: false
QoS Class: Burstable
Node-Selectors: node-role.kubernetes.io/compute=true
Tolerations: node.kubernetes.io/memory-pressure:NoSchedule
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Normal Scheduled 1h default-scheduler Successfully assigned icp4d/dv-0 to worker01
Normal Pulled 1h kubelet, worker01 Container image "cp.icr.io/cp/cpd/dv-mariadb:v1.3.0.0-301" already present on machine
Normal Created 1h kubelet, worker01 Created container
Normal Started 1h kubelet, worker01 Started container
Normal Pulled 1h kubelet, worker01 Container image "cp.icr.io/cp/cpd/dv-engine:v1.3.0.0-301" already present on machine
Normal Created 1h kubelet, worker01 Created container
Normal Started 1h kubelet, worker01 Started container
Normal Pulled 1h (x4 over 1h) kubelet, worker01 Container image "cp.icr.io/cp/cpd/dv-engine:v1.3.0.0-301" already present on machine
Normal Created 1h (x4 over 1h) kubelet, worker01 Created container
Normal Started 1h (x4 over 1h) kubelet, worker01 Started container
Normal Killing 21m (x2 over 47m) kubelet, worker01 Killing container with id cri-o://dv-opensource:Container failed liveness probe.. Container will be killed and recreated.
Warning BackOff 1m (x316 over 1h) kubelet, worker01 Back-off restarting failed container
---------------------------------------------
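Reading the dv-server readiness probe above, the container reports ready only when the backup/restore marker /mnt/marker_files/.bar exists, or when both the JDBC port 32051 answers and the /mnt/marker_files/.dv_initialized marker file is present. If useful, the same check can be re-run by hand (assuming the pod still accepts exec while crash-looping, which it may not):
---------------------------------------------
oc exec dv-0 -c dv-server -- bash -c 'nc localhost 32051 < /dev/null && test -f /mnt/marker_files/.dv_initialized && echo READY || echo NOT-READY'
---------------------------------------------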
As for the environment, I installed it on 3 masters and 4 workers, using NFS (1 TB) for storage rather than Portworx.
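Since the failure only appears after node restarts and the storage is NFS, I can also check whether the data volumes rebind cleanly (claim names taken from the describe output above):
---------------------------------------------
oc get pvc dv-pvc dv-caching-pvc -n icp4d
---------------------------------------------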
Could anyone advise on the likely cause of this and how to resolve it?
Kind Regards,
Chris
#CloudPakforDataGroup