Hello Eduardo,
Until now, we haven't really observed this kind of behaviour during decision server console and decision server runtime startup.
However, you can implement a fairly simple solution using the init-container mechanism. If you have a look at the deployment templates, you can see that we already use one to make sure the internal database is up and ready before starting the ODM containers:
initContainers:
{{- if and (empty .Values.externalCustomDatabase.datasourceRef) (empty .Values.externalDatabase.serverName) }}
- name: init-decisionserverruntime
{{ include "image.tagOrDigest" (dict "containerName" "dbserver" "containerTag" .Values.internalDatabase.tagOrDigest "root" .) | indent 8 }}
{{ include "odm-security-context" . | indent 8 }}
  command: ['sh','-c', '{{ template "odm-sql-internal-db-check" . }}']
  env:
{{ include "odm-sql-internal-db-check-env" . | indent 8 }}
  resources:
{{ include "odm-sql-internal-db-check-resources" . | indent 10 }}
{{- end }}
using the following helper template:
{{- define "odm-sql-internal-db-check" -}}
until [ $CHECK_DB_SERVER -eq 0 ]; do echo {{ template "odm.dbserver.fullname" . }} on port 5432 state $CHECK_DB_SERVER; CHECK_DB_SERVER=$(psql -q -h {{ template "odm.dbserver.fullname" . }} -d $PGDATABASE -c "select 1" -p 5432 >/dev/null;echo $?); echo "Check $CHECK_DB_SERVER"; sleep 2; done;
{{- end -}}
So, if I understand correctly that you need a dependency between the decision server console and the decision server runtime, you can use a similar mechanism: add a new init-container to the decision server runtime template that loops on a curl call to the internal decision server console service and exits once the console responds.
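As an illustration, here is a minimal sketch of such an init-container, assuming the console service is named {{ .Release.Name }}-odm-decisionserverconsole and listens on HTTP port 9060 with the /res context root (check the actual service name, port and path in your release, for instance with kubectl get svc):

initContainers:
- name: init-wait-for-console
  # Any small image that provides curl can be used here.
  image: curlimages/curl:8.5.0
  command:
    - 'sh'
    - '-c'
    # Loop until the console answers an HTTP request; curl returns 0 as soon
    # as the service responds, even if it only sends a redirect to the login page.
    - >
      until curl -s -o /dev/null http://{{ .Release.Name }}-odm-decisionserverconsole:9060/res;
      do echo "Waiting for the decision server console..."; sleep 2; done

With this in place, Kubernetes only starts the decision server runtime container once the console is reachable, so the runtimes should register properly on first startup.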
BR,
Mathias
------------------------------
Mathias Mouly
------------------------------
Original Message:
Sent: Mon December 27, 2021 09:00 AM
From: Eduardo Izquierdo Lázaro
Subject: ODM Helm Chart installation in Kubernetes
If you install the ODM Helm Chart on Kubernetes, 2 deployments are created, one for the decision server console and one for the decision server runtime. The cluster scheduler then starts those 2 deployments in an arbitrary order (possibly depending on the Kubernetes implementation). If the decision server runtime pods are scheduled first, they're not registered in the console.
I've been looking for a solution, but there is no obvious way to establish a startup order. A microservices architecture is not designed around pods registering themselves in other pods; instead, a pod calls another pod and, if the call fails (for instance because that pod isn't started yet), it crashes and lets the Kubernetes self-healing mechanism start a new one. There are some solutions, but some are proprietary and others are very tricky.
The easy solution I found for ODM is to edit the DSR deployment, scale the number of pods down to 0, and then scale back up to the desired number of pods (see the kubectl sketch below). This second time, if the DSC is up, the runtimes register properly. This solution works fine, but it's manual.
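For reference, a minimal sketch of this manual workaround with kubectl, assuming the runtime deployment is named my-odm-release-odm-decisionserverruntime and normally runs 2 replicas (both are placeholders to adjust to your release):

# Scale the decision server runtime down to 0 pods...
kubectl scale deployment my-odm-release-odm-decisionserverruntime --replicas=0
# ...then back up once the decision server console is ready, so that the
# new runtime pods register themselves in the console at startup.
kubectl scale deployment my-odm-release-odm-decisionserverruntime --replicas=2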
Note that if you're using a local cluster like Minikube or Red Hat CRC, this happens every time you restart the cluster (for instance after restarting the laptop).
Is there a better way to solve this problem, or any good practice around it?
------------------------------
Eduardo Izquierdo Lázaro
Automation Architect
DECIDE
Madrid
609893677
------------------------------