Thank you
@Pierre Richelle! I ran the kubectl get apic command before enabling Test and Monitor, then enabled Test and Monitor and repeated the command. These are the only differences; all the other results are the same (other than the pods' alive time):
I looked at the logs in the openshift-operators namespace for ibm-apiconnect-65bdf6c6cb-9gdg5; the only error I see is one related to our SFTP backups:
{"level":"error","ts":1642459006.085245,"logger":"sftp-client","msg":"Failed to parse SFTP BackupID into time format","backupID":"20211118-030004F_20211118-105443I","backupIDWithoutType":"20211118-030004F_20211118-105443","error":"parsing time \"20211118-030004F_20211118-105443\": extra text: \"F_20211118-105443\"","stacktrace":"github.com/go-logr/zapr.(*zapLogger).Error\n\t/go/pkg/mod/github.com/go-logr/
zapr@v0.1.0/zapr.go:128\ngithub.ibm.com/velox/apiconnect-operator/ibm-apiconnect/controllers/management.apiconnect/postgres/sftp-client.(*sftpClient).ListBackups\n\t/workspace/ibm-apiconnect/controllers/management.apiconnect/postgres/sftp-client/sftp-client.go:123\ngithub.ibm.com/velox/apiconnect-operator/ibm-apiconnect/controllers/management%2eapiconnect.(*ManagementClusterReconciler).reconcileHistoricalSFTPBackups\n\t/workspace/ibm-apiconnect/controllers/management.apiconnect/historic_backups.go:219\ngithub.ibm.com/velox/apiconnect-operator/ibm-apiconnect/controllers/management%2eapiconnect.(*ManagementClusterReconciler).reconcileHistoricalBackups\n\t/workspace/ibm-apiconnect/controllers/management.apiconnect/historic_backups.go:49\ngithub.ibm.com/velox/apiconnect-operator/ibm-apiconnect/controllers/management%2eapiconnect.(*ManagementClusterReconciler).Reconcile\n\t/workspace/ibm-apiconnect/controllers/management.apiconnect/managementcluster_controller.go:593\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/go/pkg/mod/sigs.k8s.io/
controller-runtime@v0.6.3/pkg/internal/controller/controller.go:244\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/go/pkg/mod/sigs.k8s.io/
controller-runtime@v0.6.3/pkg/internal/controller/controller.go:218\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/go/pkg/mod/sigs.k8s.io/
controller-runtime@v0.6.3/pkg/internal/controller/controller.go:197\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil.func1\n\t/go/pkg/mod/k8s.io/
apimachinery@v0.20.0/pkg/util/wait/wait.go:155\nk8s.io/apimachinery/pkg/util/wait.BackoffUntil\n\t/go/pkg/mod/k8s.io/
apimachinery@v0.20.0/pkg/util/wait/wait.go:156\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/go/pkg/mod/k8s.io/
apimachinery@v0.20.0/pkg/util/wait/wait.go:133\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/go/pkg/mod/k8s.io/
apimachinery@v0.20.0/pkg/util/wait/wait.go:90"}
{"level":"info","ts":1642459006.085661,"logger":"controllers.ManagementCluster","msg":"Reconcile management historical backups list","managementcluster":"np-apic/np-apic-mgmt"}
Is there a specific log entry I should be looking for, or am I perhaps looking at the wrong operator?
------------------------------
Jennifer Stipe
------------------------------
Original Message:
Sent: Wed January 19, 2022 03:11 AM
From: Pierre Richelle
Subject: mgmt and apic subsystems in pending status forever after enabling Test and Monitor
Hello Jennifer,
I would first issue the command:
kubectl get apic
--> this provides a complete view of the APIC subsystem deployments and their status.
If the command helps you pinpoint where the issues are, you can then look at the related pod logs.
Even though the pods are running, they could log useful information.
Last but not least, check the operator logs!
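As a rough sketch, those steps might look like the following (the deployment name ibm-apiconnect is an assumption based on the pod name mentioned elsewhere in this thread, and the grep pattern assumes the operator's JSON log format; the filter is demonstrated here on a sample line so it can run without a cluster):

```shell
# In practice, against the cluster:
#   kubectl get apic -n np-apic
#   kubectl logs -n openshift-operators deploy/ibm-apiconnect | grep '"level":"error"'
# Demonstration of the error-level filter on a sample operator log line:
sample='{"level":"error","ts":1642459006.0,"logger":"sftp-client","msg":"Failed to parse SFTP BackupID into time format"}'
printf '%s\n' "$sample" | grep -c '"level":"error"'   # prints 1 (one matching line)
```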
------------------------------
Pierre Richelle
IBM Hybrid Cloud Integration Specialists
IBM
+32474681892
Original Message:
Sent: Tue January 18, 2022 12:11 PM
From: Jennifer Stipe
Subject: mgmt and apic subsystems in pending status forever after enabling Test and Monitor
Hello!
I am trying to enable the Test and Monitor capability on our API Connect 10.0.3.0-ifix1 running on OpenShift v4.7. Everything works fine before I enable Test and Monitor, but when I do, both the management cluster and the API Connect cluster get stuck in Pending status and never return to Ready:
NAME           READY   STATUS    VERSION          RECONCILED VERSION   AGE
np-apic-mgmt   0/0     Pending   10.0.3.0-ifix1   10.0.3.0-ifix1-351   68d

NAME                                           READY   STATUS    VERSION          RECONCILED VERSION   AGE
apiconnectcluster.apiconnect.ibm.com/np-apic   6/7     Pending   10.0.3.0-ifix1   10.0.3.0-ifix1-351   68d
I have waited several hours before removing Test and Monitor, and both subsystems come back up once it is removed.
This is the YAML in np-apic-mgmt:
testAndMonitor:
  enabled: true
  hubEndpoint:
    annotations:
      certmanager.k8s.io/issuer: ingress-issuer
    hosts:
    - name: hub.{our stack host value is here}
      secretName: hub-endpoint
  turnstileEndpoint:
    annotations:
      certmanager.k8s.io/issuer: ingress-issuer
    hosts:
    - name: turnstile.{our stack host value is here}
      secretName: turnstile-endpoint
I am using this link:
https://www.ibm.com/docs/en/api-connect/10.0.1.x?topic=configuration-installing-automated-api-behavior-testing-application
I do not see the hub or turnstile pods at all.
All the other pods in both the np-apic and ibm-common-services namespaces are up, and there are no events on either the mgmt or apic clusters. I'd like to try troubleshooting this myself before opening a case, but I'm not sure where to look for the issue. Has anyone had a similar issue?
I will say our worker nodes that are tagged for APIC/CP4i are overutilized; could this cause the issue? Usually when that happens I see a pod in Pending status with either a CPU or memory error, but I am not seeing anything obvious here. If you have any ideas I'd really appreciate it, thank you.
------------------------------
Jennifer Stipe
------------------------------