Data Replication

  • 1.  CDC V11.4 deployment and configuration

    Posted Tue December 15, 2020 11:22 AM
    Edited by System Tue March 28, 2023 10:58 AM
    Hello,
    I'm getting familiar with CDC and have some questions that need clarification and advice.

    We are planning to deploy CDC V11.4 as follows:

    - The replication source is IBM z/OS Db2 V12, using the CDC replication engine for Db2 remote source.
    - The replication target is PostgreSQL, using the CDC FlexRep replication engine. The PostgreSQL version needs to be V9 or higher.

    - The Access Server (CDC server) is planned to run on Linux.
    Q1: The source engine is planned to be installed on the Access Server host, as the mainframe contains only stored procedures. Are there any considerations with this approach?
    Q2: Do we need to install the target engine on the PostgreSQL DB server, or can we install the target replication engine on the Access Server host as well?
    -- as in the CDC Architecture presentation, page 13, with scraper and apply on the CDC server

    When configuring the target datastore, in our case the replication engine (FlexRep) would be on the CDC Access Server host and the actual database (PostgreSQL) on a different DB server. This should be possible.
    - The V11.4 manual states:
    DataStores: Each datastore represents the database or target destination to which you want to connect
    Database/URL: Specifies the name of the database or the Universal Resource Locator (URL) of the database server you want to connect to.

    I assume a JDBC driver is needed at the target end. Or do we need the JDBC driver on the CDC server that connects to the target DB?

    For the Management Console (MC), we are considering a separate server. Or is a workstation enough?
    Q3: Is it possible to install several MCs that connect to the same Access Server?

    About the high availability configuration:
    - On the mainframe we have a sysplex that provides a DVIPA address to connect to.
    - For the Access Server we are considering a cluster solution using a hot-standby scenario.
    - MC: no special considerations, as we connect to the cluster (virtual) IP of the Access Server.

    About the replication mode:
    Our target does not contain any tables yet, as the idea is to set up selected source tables on the target with the same schema.
    - Can we create the tables on the target by using Refresh mode at initial startup, or does it expect the tables to be present?

    When configuring the subscription, it expects the target tables to exist.

    To create the tables on the target, we need to extract the DDL from the mainframe and run it on the target. Then we will configure the subscription.

    I found this best practice: Best Practice - Deployment Configurations for LUW https://www.ibm.com/support/pages/ibm-data-replication-change-data-capture-cdc-best-practices#Best%20Practice%20-%20Deployment%20Configurations%20for%20LUW
    - Is this information relevant for a V11.4 deployment?

    Please advise and comment.

    Best Regards
    Markku Niemi

    ------------------------------
    Markku Niemi
    TietoEvry
    Finland
    ------------------------------

    #DataReplication
    #DataIntegration


  • 2.  RE: CDC V11.4 deployment and configuration

    Posted Wed December 16, 2020 10:17 AM

    Hello

    The CDC wiki
    https://www.ibm.com/support/pages/ibm-data-replication-community-wiki
    links to the most comprehensive deployment guide:

    CDC Redbook

    The CDC Redbook covers most aspects of CDC, including deployment, HA environments, API, etc.


    While this guide was written for an older CDC version, it is still relevant for many of the questions you asked.

    Specifically:

    The CDC z/OS remote capture engine/agent is available as a Linux x86 build and is installed remotely from the z/OS Db2 database, with the most important criterion being that it is "network near" the mainframe.

    The CDC FlexRep engine/agent is available as a Linux x86 build and can be installed remotely from the PostgreSQL database, with the most important criterion being that it is "network near" the database.

    CDC Access Server (AS) needs to be installed such that both engines and the Management Console GUI can connect to it. Typically the server would be in the same security zone as the CDC engines/agents. It can be installed on the same server as a CDC agent/engine.

    Management Console is designed to be run on a Windows Desktop with TCP/IP network communication to the Access Server.   Network stability is important during active configuration of CDC subscriptions.   Loss of network connection during a configuration can corrupt CDC metadata and should be avoided.

    "Network Near" : The TCP/IP connection from CDC agents to their database should be stable, low latency (ping time) and avoid traversing firewalls or other network devices which can limit/constrain connections.  

    WAN Networks  : The TCP/IP connection from CDC agents to other CDC agents can be across long distance WAN, with preference for stable networks that avoid traversing firewalls or other network devices which can limit/constrain connections.  
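
    As a rough way to judge whether a candidate host is "network near" a database, one could time TCP connections from that host to the database port. The following is only a sketch under stated assumptions: the function names are made up for illustration, it measures connection setup time rather than JDBC statement round trips, and it says nothing about what latency CDC itself considers acceptable.

```python
import socket
import time
from statistics import median


def tcp_connect_ms(host: str, port: int, timeout: float = 3.0) -> float:
    """Return the time in milliseconds needed to open one TCP connection."""
    start = time.perf_counter()
    # create_connection raises OSError if the port cannot be reached in time
    with socket.create_connection((host, port), timeout=timeout):
        pass
    return (time.perf_counter() - start) * 1000.0


def median_connect_ms(host: str, port: int, samples: int = 5) -> float:
    """Median connect time over several attempts, to smooth out jitter."""
    return median(tcp_connect_ms(host, port) for _ in range(samples))
```

    Running `median_connect_ms("pg.example.internal", 5432)` (a hypothetical host name with the PostgreSQL default port) from each candidate server gives a simple number to compare placements with.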

    FlexRep requires a local JDBC client driver installation; the driver is what CDC uses to communicate with the target database.
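
    Because the driver runs on the FlexRep host, the PostgreSQL server is addressed from there by a JDBC URL. A minimal sketch of the standard PostgreSQL JDBC URL form follows; the helper name, host and database are hypothetical, and exactly how the CDC configuration tool asks for these values is not covered in this thread.

```python
def postgres_jdbc_url(host: str, database: str, port: int = 5432) -> str:
    """Build a PostgreSQL JDBC URL of the form jdbc:postgresql://host:port/database."""
    if not host or not database:
        raise ValueError("host and database are required")
    # 5432 is the PostgreSQL default port; pass another value if the server differs
    return f"jdbc:postgresql://{host}:{port}/{database}"


# Hypothetical remote database server, reached from the FlexRep host
print(postgres_jdbc_url("pg.example.internal", "sales"))
```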

    ------------------------------
    Glenn Steffler
    ------------------------------



  • 3.  RE: CDC V11.4 deployment and configuration

    Posted Mon December 28, 2020 05:19 AM

    Hello,

    Thank you for the valuable comments.

    Are there any comments or considerations for a scenario where the target DB resides in Google Cloud? Google Cloud SQL provides PostgreSQL, and CDC provides FlexRep as the target engine for PostgreSQL.



    ------------------------------
    Markku Niemi
    Technical specialist
    TietoEvry
    Espoo
    ------------------------------



  • 4.  RE: CDC V11.4 deployment and configuration

    Posted Tue January 05, 2021 04:21 PM
    https://www.ibm.com/support/knowledgecenter/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdcflexrep.doc/concepts/configuringcdc.html

    There are two options for targeting cloud environments like PostgreSQL on cloud.

    1.  Install the CDC FlexRep target engine "network close" to the PostgreSQL instance on cloud. This means configuring a Linux or Windows virtual environment in the same cloud data center / area as the PostgreSQL database. This will give the CDC engine low network latency and good network stability to PostgreSQL. Performance will be highest using this deployment.

    2.  Deploy CDC FlexRep remotely, i.e. "network remote" from the PostgreSQL database. CDC will have larger network latency to the database, which will lengthen the round trip of each JDBC statement issued and result in lower performance. However, if the requirement is 1000 rows per second, this can be achieved.
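
    To see why round-trip latency matters for option 2, here is a back-of-the-envelope model. It is a simplification made for illustration only: it assumes statements are issued one after another on a single connection, ignores database execution time, and the "rows per round trip" parameter is a hypothetical stand-in for whatever batching the apply side actually does.

```python
def max_rows_per_second(round_trip_ms: float, rows_per_round_trip: int = 1) -> float:
    """Upper bound on apply rate when each round trip carries a fixed number of rows."""
    if round_trip_ms <= 0:
        raise ValueError("round_trip_ms must be positive")
    return rows_per_round_trip * 1000.0 / round_trip_ms


# Same data center: about 1 ms per round trip, one row per statement
print(max_rows_per_second(1.0))        # 1000.0
# Remote engine: about 20 ms per round trip, one row per statement
print(max_rows_per_second(20.0))       # 50.0
# Remote engine, 20 rows sent per round trip
print(max_rows_per_second(20.0, 20))   # 1000.0
```

    Under these assumptions, a remote deployment needs either a short round trip or several rows per round trip to reach a given rows-per-second target.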

    There are CDC settings to help with network disconnections:
    https://www.ibm.com/support/knowledgecenter/SSTRGZ_11.4.0/com.ibm.cdcdoc.cdcflexrep.doc/tasks/maintainactivetcpconnections.html

    ------------------------------
    Glenn Steffler
    ------------------------------



  • 5.  RE: CDC V11.4 deployment and configuration

    Posted Wed January 13, 2021 03:45 AM
    Hello
    I'd like to clarify how MC connects to Access Server. As I understand it, we could use one MC with multiple Access Servers. We are planning to have separate Access Servers for the system test and acceptance test environments, but only one MC.

    We would use a unique port for each Access Server, and when starting the MC we would specify the port of the environment we want to access (system test or acceptance test).

    Please advise.
    Best Regards




    ------------------------------
    Markku Niemi
    Technical specialist
    TietoEvry
    Espoo
    ------------------------------



  • 6.  RE: CDC V11.4 deployment and configuration

    Posted Wed January 13, 2021 04:09 AM
    Hello Markku

    When you log on to the Access Server from Management Console, you specify the host name and the listener port of the Access Server.

    The default listener port of the Access Server is 10101. However, you can have multiple instances of Access Server on *nix simply by installing multiple times into different installation paths, but you must ensure that the second and subsequent installations of Access Server do not use the default listener port. Typically you would use 10102 for the second installation, 10103 for the third installation, and so on.

    So it is the combination of server name and listener port that uniquely identifies a given Access Server instance.

    The Management Console GUI remembers the hostnames and the associated ports that you have used in the past.
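
    If it helps to keep track of several Access Server instances, the host and port pairs can be written down and checked with a small script. This is only a sketch: the host name and environment names are hypothetical, the ports follow the 10101/10102 convention described above, and the check only confirms that something is listening on the port, not that it is an Access Server.

```python
import socket
from typing import Dict, List, NamedTuple, Tuple


class AccessServer(NamedTuple):
    host: str
    port: int


# Hypothetical host; two installations on the same server with distinct ports
ACCESS_SERVERS: Dict[str, AccessServer] = {
    "system-test": AccessServer("cdc-as.example.internal", 10101),
    "acceptance-test": AccessServer("cdc-as.example.internal", 10102),
}


def find_duplicate_endpoints(servers: Dict[str, AccessServer]) -> List[Tuple[str, str]]:
    """Return pairs of environment names that share the same host and port."""
    seen: Dict[AccessServer, str] = {}
    duplicates: List[Tuple[str, str]] = []
    for env, endpoint in servers.items():
        if endpoint in seen:
            duplicates.append((seen[endpoint], env))
        else:
            seen[endpoint] = env
    return duplicates


def is_listening(endpoint: AccessServer, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to the endpoint can be opened."""
    try:
        with socket.create_connection((endpoint.host, endpoint.port), timeout=timeout):
            return True
    except OSError:
        return False
```

    An empty result from `find_duplicate_endpoints(ACCESS_SERVERS)` means every environment has its own host and port combination.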

    Hope this helps

    Robert

    ------------------------------
    Robert Philo
    ------------------------------



  • 7.  RE: CDC V11.4 deployment and configuration

    Posted Mon July 19, 2021 10:38 AM
    Thank you for this!

    ------------------------------
    John Kelly
    ------------------------------