Here are the logs from the restart. The Repos do exist, but they are not found on start.
+ SIGNALS_TRAPPED=(INT HUP QUIT TERM STOP)
+ declare -a SIGNALS_TRAPPED
+ BKP_FILE=xmeta-services.tar.gz
+ BKP_STORE=/opt/IBM/InformationServer/xmeta-bkp
+ datadir=/home/db2inst1/Repos
+ installdir=/opt/IBM/InformationServer
+ '[' -d /home/db2inst1/Repos/xmeta ']'
+ echo 'Extracting repos dedicated volume data'
Extracting repos dedicated volume data
+ /opt/IBM/InformationServer/initScripts/initReposVolumeData.sh
+ BKP_FILE=xmeta-services.tar.gz
+ BKP_STORE=/opt/IBM/InformationServer/xmeta-bkp
+ DATA_DIRS=Repos
+ DATA_DIR_LIST=($DATA_DIRS)
+ declare -a DATA_DIR_LIST
+ installdir=/home/db2inst1
+ DEDICATED_VOL_MOUNT=/mnt/dedicated_vol/Repository/is-xmetadocker
+ datamountdir=/mnt/dedicated_vol/Repository/is-xmetadocker
+ sudo chown -R db2inst1:db2iadm1 /mnt/dedicated_vol/Repository/is-xmetadocker/Repos
+ [[ -d /mnt/dedicated_vol/Repository/is-xmetadocker/Repos/xmeta ]]
/mnt/dedicated_vol/Repository/is-xmetadocker already exists
Symbolic link to Repos does not exist
+ echo '/mnt/dedicated_vol/Repository/is-xmetadocker already exists'
+ newdb=false
+ '[' '!' -L /home/db2inst1/Repos ']'
+ echo 'Symbolic link to Repos does not exist'
+ sudo /bin/rmdir /home/db2inst1/Repos
+ sudo /bin/rm /home/db2inst1/Repos
/bin/rm: cannot remove '/home/db2inst1/Repos': No such file or directory
+ ln -s /mnt/dedicated_vol/Repository/is-xmetadocker/Repos /home/db2inst1/Repos
+ sudo chown -R db2inst1:db2iadm1 /mnt/dedicated_vol/Repository/is-xmetadocker/Repos
+ sudo chmod -R 755 /mnt/dedicated_vol/Repository/is-xmetadocker/Repos
+ sudo su - db2inst1 -c '/bin/rm -f /home/db2inst1/sqllib/.ftok; /home/db2inst1/sqllib/bin/db2ftok'
+ sudo su - db2inst1 -c '. sqllib/db2profile; db2start'
05/15/2020 15:44:04 0 0 SQL1063N DB2START processing was successful.
SQL1063N DB2START processing was successful.
+ sudo su - db2inst1 -c '. sqllib/db2profile; db2 catalog database XMETA on /home/db2inst1/Repos/xmeta; db2 catalog database IADB on /home/db2inst1/Repos/iadb; db2 catalog database DSODB on /home/db2inst1/Repos/dsodb'
SQL6028N Catalog database failed because database "XMETA" was not found in
the local database directory.
SQL6028N Catalog database failed because database "IADB" was not found in the
local database directory.
SQL6028N Catalog database failed because database "DSODB" was not found in
the local database directory.
+ sudo su - db2inst1 -c '. sqllib/db2profile; db2 activate database xmeta; db2 activate db iadb; db2 activate db dsodb'
SQL1013N The database alias name or database name "XMETA" could not be found.
SQLSTATE=42705
SQL1013N The database alias name or database name "IADB" could not be found.
SQLSTATE=42705
SQL1013N The database alias name or database name "DSODB" could not be found.
SQLSTATE=42705
+ sudo su - db2inst1 -c '. /home/db2inst1/sqllib/db2profile; db2 connect to xmeta; db2 -tvf /opt/IBM/InformationServer/initScripts/updateNodeCert.sql; db2 commit; db2 connect reset'
SQL1013N The database alias name or database name "XMETA" could not be found.
SQLSTATE=42705
UPDATE XMETA.REGISTRATION_CONFIGPROPERTY SET VALUE_XMETA='{isnode2048:isf}MIIDIzCCAgugAwIBAgIEILF2WjANBgkqhkiG9w0BAQ0FADBCMQswCQYDVQQGEwJVUzEMMAoGA1UEChMDSUJNMRcwFQYDVQQLEw5Tb2Z0d2FyZSBHcm91cDEMMAoGA1UEAxMDSUlTMB4XDTE5MDgyMzIzMDYzMFoXDTQ3MDEwNzIzMDYzMFowQjELMAkGA1UEBhMCVVMxDDAKBgNVBAoTA0lCTTEXMBUGA1UECxMOU29mdHdhcmUgR3JvdXAxDDAKBgNVBAMTA0lJUzCCASIwDQYJKoZIhvcNAQEBBQADggEPADCCAQoCggEBAIIeUolpzIK5cq2HPEnumMbe5DlZVN24aeJi34sFB7w1FaaIhlR1uKuwaU/j+VQONRAQzWUjno2xgB28FIdwzFfFr40q4OyVx/KZZ8WxXegR5VB4tS0/ZhMthIHoJrn/j54xnsFLfLZOONXeEgoP2uhroUAsokyOZdA8A6RzMveTBoJrKczbBca5zx7PzFnLulmMgJ8C7wzfVBeNnAS/el/jxrIJTJuALVjP7e3rDmxh/pUydbzcbVlZqI5GPfbaEgE++5rJLBWZDT7Cty5QH3u9LJ9fuGLWmygi6W5mNPYmgiMjx2Zr1DLfX5yp9YpU/wMSmifs6tDT7sy0utQG7DMCAwEAAaMhMB8wHQYDVR0OBBYEFBdOnUF1fu5UCYDHpot17GXqU2TxMA0GCSqGSIb3DQEBDQUAA4IBAQAf6AD7dPhqT8DA9UIdemDP2eah9KJac1Fa/y3EpEG9APNbr0lnbiezDcbJSx9jYGAPKcko/7ErfMhqwf6snCg4DWKoh3ypGh5/fQtsWPEVdo5lmc+pI+TvdTRYJH82EocpS4+cnq6CLTz2vTf9LkYFWrqJMSvyN2Qsv+7Crs0OZzNeHsXJ7sS6vBHgIGQV6wYem2h5uq36O0OywVgXb/654XsW7cRaqh1zog529KTAZ0DKwxcOiHaVKfsXIxgngmw7Ti+2LB5UAlFOT0icldgPlZP5Wm0jhCXt0D4BawqjnyPyxrVR0VSS18b7J8AqIw5p8hqmDwwz3wO8x061jSgQ' WHERE NAME_XMETA='NODE_CERTIFICATE'
DB21034E The command was processed as an SQL statement because it was not a
valid Command Line Processor command. During SQL processing it returned:
SQL1024N A database connection does not exist. SQLSTATE=08003
SQL1024N A database connection does not exist. SQLSTATE=08003
SQL1024N A database connection does not exist. SQLSTATE=08003
+ wait 1569
+ tail -f /dev/null
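For anyone reading the trace above: as far as I understand, SQL6028N is what `db2 catalog database ... on <path>` returns when Db2 finds no local database directory under that path, so it may help to confirm what the new symlink actually points at. Below is a minimal sketch, assuming it is run inside the Xmeta container. `check_repos_link` is a hypothetical helper name; the default path and the three database names are taken from the log.

```shell
# Hypothetical helper: report whether the Repos symlink resolves and whether
# each expected database directory under it is populated.
check_repos_link() {
  link="${1:-/home/db2inst1/Repos}"   # default path taken from the log
  if [ ! -L "$link" ]; then
    echo "not-a-symlink: $link"
    return 1
  fi
  if [ ! -d "$link" ]; then           # -d follows the link to its target
    echo "dangling-symlink: $link"
    return 1
  fi
  for db in xmeta iadb dsodb; do
    if [ ! -d "$link/$db" ]; then
      echo "missing: $db"
    elif [ -z "$(ls -A "$link/$db")" ]; then
      echo "empty: $db"
    else
      echo "present: $db"
    fi
  done
}
```

If a directory is reported as empty or missing, running `db2 list db directory on /home/db2inst1/Repos/xmeta` as db2inst1 should show whether Db2 sees a local database directory at that path.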
------------------------------
Karen Paciocco
------------------------------
Original Message:
Sent: Fri May 15, 2020 09:42 AM
From: TOMASZ HANUSIAK
Subject: Existing ICP 3.2.1 Cluster CP4D 2.1.0.2 (x86_64.520) E200512 Unable to resolve RPC Address
Hi,
What I would try is to restart the Xmeta pod. It has an init script inside that re-creates the databases.
Thanks
------------------------------
TOMASZ HANUSIAK
Original Message:
Sent: Fri May 15, 2020 09:18 AM
From: Karen Paciocco
Subject: Existing ICP 3.2.1 Cluster CP4D 2.1.0.2 (x86_64.520) E200512 Unable to resolve RPC Address
Thank you for your help!
I do have an open support case and am waiting for instructions. I checked the Xmeta pod and the db directory was empty, but I was able to create a sample db. Below are my pods after the failed installation. I can attempt the installation again, but I am waiting to hear from Support first. If you would like to see any logs, please let me know. Thank you again.
------------------------------
Karen Paciocco
Original Message:
Sent: Thu May 14, 2020 10:36 AM
From: TOMASZ HANUSIAK
Subject: Existing ICP 3.2.1 Cluster CP4D 2.1.0.2 (x86_64.520) E200512 Unable to resolve RPC Address
Hi,
There used to be some issues with the Xmeta pod. Did you check the Xmeta pod itself?
You should be able to exec into it, then run `sudo su - db2inst1`, do `db2 list db directory`, and finally try to connect to one of the databases.
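The checks described above could also be run from outside the pod, roughly like this. It is a sketch only: the pod name is a placeholder to be looked up with `kubectl get pods`, the `icp-data` namespace is assumed from the host names in the original post, and `check_xmeta_dbs` is a hypothetical helper name.

```shell
# Hypothetical helper: list the Db2 database directory inside the Xmeta pod
# and try a connection to xmeta as the instance owner.
check_xmeta_dbs() {
  pod="$1"                 # placeholder: the actual Xmeta pod name
  ns="${2:-icp-data}"      # assumption: namespace used by the install
  kubectl exec -n "$ns" "$pod" -- \
    sudo su - db2inst1 -c '. sqllib/db2profile; db2 list db directory; db2 connect to xmeta; db2 connect reset'
}
```

An empty listing from `db2 list db directory` would be consistent with the SQL1013N errors in the restart log.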
Can you open a support case for it? It may need some deeper troubleshooting/webex.
Thanks
------------------------------
TOMASZ HANUSIAK
Original Message:
Sent: Thu May 14, 2020 10:22 AM
From: Karen Paciocco
Subject: Existing ICP 3.2.1 Cluster CP4D 2.1.0.2 (x86_64.520) E200512 Unable to resolve RPC Address
We were able to get past this issue, but the installation now fails at:
2020-05-13 20:53:16 UTC - Running command: //var/lib/icp/icp-data/InstallPackage/components/dpctl --config //var/lib/icp/icp-data/InstallPackage/components/install.yaml helm waitChartReady -r icp-data-ibm-iisee-zen100 -t 60
time="2020-05-13T16:53:19-04:00" level=info msg="No daemonsets under this release. Please continue with the rest of the process."
time="2020-05-14T02:23:28-04:00" level=fatal msg="Failed to get deployments due to: Unauthorized"
2020-05-14 06:23:28 UTC - Installation failed for //var/lib/icp/icp-data/InstallPackage/components/../modules/ibm-iisee-zen:1.0.0
Looking at the crashed icp-data-ibm-iisee-zen100-ibm-iisee-zen-gov... pods, the crash appears to be caused by the failed db2 connection. Here is a sample log.
[Thu May 14 13:15:36 UTC 2020] Starting database migration tool...
Flyway Community Edition 5.2.4 by Boxfuse
ERROR:
Unable to obtain connection from database (jdbc:db2://is-xmetadocker:50000/xmeta) for user 'xmeta': [jcc][t4][2057][11264][4.23.42] The application server rejected establishment of the connection.
An attempt was made to access a database, xmeta, which was either not found or does not support transactions. ERRORCODE=-4499, SQLSTATE=08004
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL State : 08004
Error Code : -4499
Message : [jcc][t4][2057][11264][4.23.42] The application server rejected establishment of the connection.
An attempt was made to access a database, xmeta, which was either not found or does not support transactions. ERRORCODE=-4499, SQLSTATE=08004
Are you aware of this issue? Thank you for all your assistance!
------------------------------
Karen Paciocco
Original Message:
Sent: Wed May 13, 2020 04:45 PM
From: TOMASZ HANUSIAK
Subject: Existing ICP 3.2.1 Cluster CP4D 2.1.0.2 (x86_64.520) E200512 Unable to resolve RPC Address
Hi,
The missing ClusterIP is ok, so we can ignore this.
Regarding the pods, can you check whether the zen-metastoredb-init job exists and was successful?
kubectl get job | grep meta
kubectl describe job zen-metastoredb-init
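If a scripted check is easier than reading the `describe` output, something like the following might work. It is only a sketch: `job_succeeded` is a hypothetical helper name, and the `icp-data` namespace is an assumption based on the host names in the original post.

```shell
# Hypothetical helper: succeed only if the job reports at least one
# successful completion in its status.
job_succeeded() {
  job="$1"
  ns="${2:-icp-data}"      # assumption: namespace used by the install
  n=$(kubectl get job "$job" -n "$ns" -o jsonpath='{.status.succeeded}')
  [ "${n:-0}" -ge 1 ]      # an empty value means no completed pods yet
}

# Example call:
#   job_succeeded zen-metastoredb-init && echo "init job completed"
```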
Thanks
------------------------------
TOMASZ HANUSIAK
Original Message:
Sent: Tue May 12, 2020 12:26 PM
From: Karen Paciocco
Subject: Existing ICP 3.2.1 Cluster CP4D 2.1.0.2 (x86_64.520) E200512 Unable to resolve RPC Address
We tried installing CP4D 2.1.0.2 (x86_64.520) into our existing ICP 3.2.1 cluster and the installation failed. The zen-metastoredb pods are in CrashLoopBackOff state:
I200512 16:02:52.741778 1 cli/start.go:923 CockroachDB CCL v2.0.6 (x86_64-unknown-linux-gnu, built 2018/10/01 13:59:40, go1.10)
I200512 16:02:53.002743 1 server/config.go:430 system total memory: 1.0 GiB
I200512 16:02:53.003014 1 server/config.go:432 server configuration:
max offset 500000000
cache size 256 MiB
SQL memory pool size 256 MiB
scan interval 10m0s
scan max idle time 200ms
event log enabled true
I200512 16:02:53.003092 1 cli/start.go:789 using local environment variables: COCKROACH_CHANNEL=kubernetes-helm
I200512 16:02:53.003133 1 cli/start.go:796 process identity: uid 0 euid 0 gid 0 egid 0
I200512 16:02:53.003162 1 cli/start.go:461 starting cockroach node
E200512 16:02:53.017597 1 cli/error.go:112 failed to start server: unable to resolve RPC address "zen-metastoredb-0.zen-metastoredb.icp-data.svc.cluster.local:26257": lookup zen-metastoredb-0.zen-metastoredb.icp-data.svc.cluster.local: no such host
Error: failed to start server: unable to resolve RPC address "zen-metastoredb-0.zen-metastoredb.icp-data.svc.cluster.local:26257": lookup zen-metastoredb-0.zen-metastoredb.icp-data.svc.cluster.local: no such host
Failed running "start"
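The name in that error follows the usual per-pod DNS pattern for a StatefulSet behind a headless service, `<pod>.<service>.<namespace>.svc.<cluster domain>`. A small sketch for checking whether such a name resolves from inside another pod is below. `pod_fqdn` and `check_pod_dns` are hypothetical helper names, `cluster.local` is assumed as the cluster domain, and `getent` is assumed to be available in the container image.

```shell
# Hypothetical helper: build the per-pod DNS name for a pod behind a
# headless service.
pod_fqdn() {
  echo "$1.$2.$3.svc.${4:-cluster.local}"
}

# Hypothetical helper: report whether that name resolves from here.
check_pod_dns() {
  host=$(pod_fqdn "$@")
  if getent hosts "$host" >/dev/null; then
    echo "resolves: $host"
  else
    echo "no such host: $host"
  fi
}

# Example call:
#   check_pod_dns zen-metastoredb-0 zen-metastoredb icp-data
```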
The ports are open:
# ss -tunlp|grep 26257|wc -l
0
# ss -tunlp|grep 8080|wc -l
0
I notice that the zen-metastoredb-public service has a cluster IP but zen-metastoredb does not:
NAME                     TYPE        CLUSTER-IP    EXTERNAL-IP   PORT(S)              AGE
cloudant-svc             ClusterIP   10.4.29.78    <none>        80/TCP,443/TCP       20h
dsx-influxdb             ClusterIP   10.4.23.27    <none>        8086/TCP             20h
redis-svc                ClusterIP   10.4.25.60    <none>        26379/TCP            20h
usermgmt-svc             ClusterIP   10.4.22.166   <none>        8080/TCP             20h
utils-api-svc            ClusterIP   10.4.26.171   <none>        8080/TCP             20h
zen-metastoredb          ClusterIP   None          <none>        26257/TCP,8080/TCP   20h
zen-metastoredb-public   ClusterIP   10.4.23.216   <none>        26257/TCP,8080/TCP   20h
Please let me know if there is any information or any steps that can be shared to resolve this issue.
Thanks,
Karen
------------------------------
Karen Paciocco
------------------------------