DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

  • 1.  DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

    Posted Tue March 28, 2023 09:21 AM

    Hi,

    After setting up a two-node HADR cluster with Pacemaker, I tried to create a qdevice using the following command:

    /opt/ibm/db2/V11.5/bin/db2cm -create -qdevice red1

    Here, red1 is the Red Hat Linux machine, which also has the Db2 Pacemaker packages installed. After the command has been running for a while, I get the following error message:

    Error: Could not create qdevice via corosync-qdevice-net-certutil

    Firewalls are disabled on all nodes.

    Any ideas?

    Regards,

    Joerg



    ------------------------------
    Jörg Burdorf
    ------------------------------


  • 2.  RE: DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

    Posted Tue March 28, 2023 02:41 PM

    Is passwordless SSH enabled between both DB nodes and the quorum device?

    Creating the qdevice connects to both DB nodes and generates a certificate, which is then shared with the quorum node so that the qdevice can work.
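
    A minimal sketch of how that check could be scripted, assuming root logins and key-based authentication. The function name check_ssh_targets and the SSH_BIN override are made up for illustration; the host names are the ones used in this thread. It has to be run on each of the three machines in turn, because the connection is needed in every direction.

```shell
#!/usr/bin/env bash
# Hypothetical helper: checks passwordless root SSH from THIS machine to each
# host given as an argument. SSH_BIN is an illustrative override for dry runs.
SSH_BIN="${SSH_BIN:-ssh}"

check_ssh_targets() {
    failed=0
    for host in "$@"; do
        # BatchMode=yes makes ssh fail instead of prompting for a password.
        if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" true \
            </dev/null >/dev/null 2>&1; then
            echo "OK   root@${host}"
        else
            echo "FAIL root@${host}"
            failed=1
        fi
    done
    return "$failed"
}

# Example, run once on sles01, once on sles02 and once on red1:
# check_ssh_targets sles01 sles02 red1
```

    Any FAIL line points at a direction in which key-based login is still missing.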



    ------------------------------
    Sumit Choudhary
    ------------------------------



  • 3.  RE: DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

    Posted Wed March 29, 2023 03:09 AM

    The logs of db2cm are located in /tmp.
    Hopefully the corresponding log can give more information on why it does not work.
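
    Since the exact log file names are not given here, one way to locate the relevant file is to search /tmp by content instead of by name. The sketch below is only an illustration: the function name find_logs_mentioning is made up, and the search string "End - Failed" is an assumption based on the marker that appears in db2cm log excerpts quoted in this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: lists regular files directly under a directory that
# were modified within the last 24 hours and contain a fixed string.
find_logs_mentioning() {
    dir="$1"
    pattern="$2"
    # -mtime -1 : modified less than 24 hours ago
    # grep -lF  : print only the names of files containing the literal string
    find "$dir" -maxdepth 1 -type f -mtime -1 \
        -exec grep -lF -- "$pattern" {} + 2>/dev/null
}

# Example (the search string is an assumption, adjust as needed):
# find_logs_mentioning /tmp "End - Failed"
```

    The function returns a non-zero status when nothing matches, so an empty result is easy to detect in a script.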



    ------------------------------
    Hans-Juergen Zeltwanger
    ------------------------------



  • 4.  RE: DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

    Posted Thu March 30, 2023 11:03 AM

    Hi folks!

    Now it is working. The hint about the log files in /tmp helped me see the error. Passwordless SSH was only set up from the two Db2 nodes to the quorum node, not from the quorum node to the two Db2 nodes. In addition, a directory had been created at /etc/corosync/qdevice/net/nssdb on the two Db2 nodes. When I ran the create command again, this message appeared in the log files:

    Node sles02 seems to be already initialized. Please delete /etc/corosync/qdevice/net/nssdb

    After deleting this directory on the two Db2 nodes, the command ran successfully!

    Successfully configured qdevice on nodes sles02 and sles01
    Attempting to start qdevice on red1
    Quorum device red1 added successfully.

    I am very happy!

    Thanks a lot for your help!!!!

    Joerg



    ------------------------------
    Jörg Burdorf
    ------------------------------



  • 5.  RE: DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

    Posted Thu March 30, 2023 11:18 AM
    Edited by Hans-Juergen Zeltwanger Fri March 31, 2023 06:57 AM

    Hi Jörg,
    I am glad it worked now.
    I also ran into this situation in the past: a half-configured qdevice with no way forward or back. In that case, the only solution seems to be to move (or delete) the corresponding config files.

    On Qdevice host:
    /etc/corosync/qnetd # mv nssdb nssdb.old

    On DB hosts:
    /etc/corosync/qdevice # mv net net.old
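
    The two renames above could be wrapped in a small helper so that an earlier backup is not overwritten on a second attempt. This is only a sketch: the function name move_aside is made up, and it needs to be run as root on the respective host.

```shell
#!/usr/bin/env bash
# Hypothetical helper: renames a file or directory to <name>.old, or to
# <name>.old.<timestamp> if <name>.old already exists.
move_aside() {
    target="$1"
    if [ ! -e "$target" ]; then
        echo "nothing to move: $target"
        return 0
    fi
    backup="${target}.old"
    if [ -e "$backup" ]; then
        # Keep the earlier backup and pick a unique name instead.
        backup="${target}.old.$(date +%Y%m%d%H%M%S)"
    fi
    mv -- "$target" "$backup" && echo "moved $target -> $backup"
}

# On the qdevice host:
# move_aside /etc/corosync/qnetd/nssdb
# On each DB host:
# move_aside /etc/corosync/qdevice/net
```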



    ------------------------------
    Hans-Juergen Zeltwanger
    ------------------------------



  • 6.  RE: DB2 HADR Cluster using Pacemaker with Corosync tiebreaker

    Posted Thu October 12, 2023 12:54 PM

    I experienced the same error when running the command below, but what I saw in the db2cm logs was different:

    db2cm -create -qdevice <qdevice_host>

    Error: Could not create qdevice via corosync-qdevice-net-certutil

    In the db2cm logs:

    Start corosync-qdevice-tool -s
    corosync-qdevice-tool: Can't connect to QDevice socket (is QDevice running?): Connection refused
    End - Failed

    ...

    Start ssh ip-10-0-35-162 "test -f /etc/corosync/qnetd/nssdb/cluster-<cluster_name>.crt"
     
    End - Failed

    ...

    Start /usr/sbin/corosync-qdevice-net-certutil -Q -n <cluster_name> <qdevice_host> <db2_node1_host> <db2_node2_host>
    root@<db2_node1_host>: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
    root@<db2_node1_host>: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).
    Node <db2_node1_host> doesn't have /usr/sbin/corosync-qdevice-net-certutil installed
    End - Failed

    Given the last error messages, I verified that "/usr/sbin/corosync-qdevice-net-certutil" is present on <db2_node1_host>. I tested passwordless SSH for root between the two Db2 nodes and the quorum device, and specified "PreferredAuthentications publickey" in /etc/ssh/ssh_config.
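
    Because the "doesn't have ... installed" message directly follows the "Permission denied" lines, it may be worth separating the two cases: a tool that is really missing versus an SSH login that failed before the check could run. The sketch below relies on ssh returning exit status 255 for its own errors and otherwise passing through the status of the remote command. The function name classify_remote_file and the SSH_BIN override are made up for illustration, and the test would have to be run from the host where the log was written.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reports whether an executable exists on a remote host,
# or whether the SSH connection itself failed.
SSH_BIN="${SSH_BIN:-ssh}"

classify_remote_file() {
    host="$1"
    path="$2"
    rc=0
    "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "root@${host}" \
        "test -x '${path}'" </dev/null >/dev/null 2>&1 || rc=$?
    case "$rc" in
        0)   echo "present" ;;     # remote test succeeded
        255) echo "ssh-failed" ;;  # ssh reported an error of its own
        *)   echo "missing" ;;     # remote test ran and failed
    esac
}

# Example (placeholder host name as used in the log excerpt above):
# classify_remote_file <db2_node1_host> /usr/sbin/corosync-qdevice-net-certutil
```

    A result of "ssh-failed" would suggest looking at the root key setup on that path rather than at the installed packages.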

    Any ideas on what to try to resolve this issue?

    Ronnie Ng



    ------------------------------
    Ronnie Ng
    ------------------------------



  • 7.  RE: DB2 HADR Cluster using Pacemaker with Corosync tiebreaker