Data Integration

 View Only
  • 1.  Doubt with CDC implementation

    Posted Wed March 16, 2022 02:49 PM
    Hello All,

    I'm new to CDC implementation and we are planning a POC for data replication from the mainframe to the distributed side. We have planned a setup like VSAM -> CDC (InfoSphere) -> Kafka -> MongoDB. My main concern is the Kafka to MongoDB connectivity; although a lot of connectors are available in the market, I am a bit worried about the connection functionality.
    If someone has done this setup before, please share the doc or link. Also let me know if you know any other easier or alternative way to implement this setup. Appreciate your help!

    Regards,
    ambro jr

    ------------------------------
    ambros jr
    ------------------------------


    #DataIntegration
    #DataReplication


  • 2.  RE: Doubt with CDC implementation

    Posted Thu March 17, 2022 02:10 AM
    Edited by System Tue March 28, 2023 10:58 AM
    Hi

    Kafka is not supported as a source in CDC.

    This means Kafka -> CDC -> MongoDB is not possible.

    The following are possible -
    VSAM -> CDC -> Kafka
    VSAM -> CDC -> MongoDB. See this workaround for MongoDB - https://ibm.ent.box.com/s/a7zsjwbwplajynn81rv284v5vorzlia1

    See all the supported sources and targets - Supported source and targets
    The first step in deploying a CDC Replication configuration solution is to establish your replication needs. You need to determine each source database for replication and its corresponding target for replication. A CDC Replication source engine captures changed data in your source database and sends source table changes to the target engine.




    ------------------------------
    HOW MING YONG
    ------------------------------



  • 3.  RE: Doubt with CDC implementation

    Posted Thu March 17, 2022 02:46 AM
    Hello

    The route you are choosing is by no means unusual. For a full implementation rather than a POC you may need to look more closely at your processes for consuming the data in Kafka, as you will need to think about ensuring consistency (e.g. processing the Kafka messages in the correct sequence, i.e. as per the source transaction log, across multiple topics and partitions).

    CDC for Kafka offers a set of APIs known as the Transactionally Consistent Consumer which can be used in the customer processes consuming the target data.
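    To illustrate why consuming in source-commit order matters, here is a toy sketch (not the Transactionally Consistent Consumer API): change records arrive interleaved across several topic-partitions, each stream internally ordered, and are merged back into transaction-log order. The commit_seq field is a placeholder for whatever monotonic transaction identifier your feed carries.

    ```python
    # Toy sketch: merge per-partition CDC change streams back into
    # source-commit order. "commit_seq" is an invented placeholder for
    # the monotonic transaction identifier a real CDC feed would carry;
    # the actual TCC API handles this for you.
    from dataclasses import dataclass, field
    import heapq

    @dataclass(order=True)
    class ChangeRecord:
        commit_seq: int                      # position in the source transaction log
        payload: str = field(compare=False)  # the change itself, excluded from ordering

    def apply_in_commit_order(partitions):
        """K-way merge of per-partition streams (each already ordered by
        commit_seq) into one stream ordered by the source transaction log."""
        return [rec.payload for rec in heapq.merge(*partitions)]

    # Records as they might land on three partitions across one or more topics:
    p0 = [ChangeRecord(1, "INSERT cust 42"), ChangeRecord(4, "UPDATE cust 42")]
    p1 = [ChangeRecord(2, "INSERT ord 7")]
    p2 = [ChangeRecord(3, "UPDATE ord 7"), ChangeRecord(5, "DELETE ord 7")]

    print(apply_in_commit_order([p0, p1, p2]))
    # The UPDATE of ord 7 is applied before the UPDATE of cust 42, matching
    # the source log, even though they sat on different partitions.
    ```

    Reading each partition independently without such a merge could apply the DELETE before the UPDATE it follows in the source log, which is exactly the inconsistency the TCC is designed to prevent.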

    As VSAM is not a relational database you will need to use Classic Data Architect to create a relational view of the VSAM data structures which CDC will use for the mappings.

    Your starting point for the IBM CDC 11.4 documentation online is https://www.ibm.com/docs/en/idr/11.4.0

    For a non-Kafka approach, IBM Lab Services people have defined a way of using MongoDB as a CDC target. This uses the CDC FlexRep target engine and a Java user exit - source at https://ibm.ent.box.com/s/ptki4zmlx17s3qane12ojhdoj3mg2uiw, documentation at https://ibm.ent.box.com/s/a7zsjwbwplajynn81rv284v5vorzlia1

    There are some caveats with this approach:
    1) The code is only POC quality and would require expansion and refinement for production use
    2) The user exit, as is, is not formally supported by IBM since it is essentially user code, so even without any amendments it should be owned and supported by the customer
    3) The sample implementation uses Derby as the underlying FlexRep target. I believe this is no longer included with CDC 11.4 in the current builds

    Hope this helps

    ------------------------------
    Robert Philo
    ------------------------------