Informix

 View Only
  • 1.  Informix ER -

    Posted Wed May 10, 2023 11:26 AM

    Hi,

    In our organization, we have several informix ids server (~30) and databases (~250)  with the same schema on informix IDS server (version 14) all around our country. We would like to build some kind of Data Lake using informix ER and the pattern Primary-Target Replication System - Data Consolidation.


    Here is a simplified example :

    •        Server IDS 1
      o   DataBase db1
           Table customer
               Column id (Primary Key)
               Column name
               …
           Table adress
              Column id (PK)
              Column description
              ….
      o   DataBase db2
           Table customer
              Column id (Primary Key)
              Column name
              ...
          Table adress
             Column id (PK)
             Column description
             ….
    •        Server IDS 2
      with db3 and db4 etc..

    And the schema of our data lake :

    -          Server IDS idsLake

    o   Database dataLake

    §  Table customer

    ·         Column id (PK)

    ·         Column dataBase (PK)

    ·         Column name

    ·         ….

    §  Table Adress

    ·         Column id (PK)

    ·         Column dataBase (PK)

    ·         Column description

    ·         …..

    For each table, the primary key is composed with ID and Database to hanble duplicated keys from the different database.

    The command to replicate without filling the database column (executed in the  Server IDS idsLake) is :

    cdr define repl -M g_er -c g_er -C always repl _ind -S row -k ID -f y -D y -T "P db1@ids1:informix.customer" "select name, description from customer" "R dataLake@idsLake:informix.customer" "select name, description from customer";

    To fill the column, I hoped i could do something like this

    cdr define repl -M g_er -c g_er -C always repl _ind -S row -k ID -f y -D y -T "P db1@ids1:informix.customer" "select name, description from customer" "R dataLake@idsLake:informix.customer" "select name, description, 'ids1' as database from customer";

    But it does not work.

    Question : how can i know the replication source to fill the column database ?

    PS : I know I can handle the database origin with asynchronous post trigger with jsonspl routine but i prefer to avoid it for this use case (because I have to code the replication et manage it by myself)

    Thanks by advance



    ------------------------------
    thibaut vandewalle
    ------------------------------


  • 2.  RE: Informix ER -

    Posted Wed May 10, 2023 01:11 PM

    Hi Thibaut,

    If I understand well your target schema (idslake), the customer table will have the primary key on the "id" and "database" columns, but the source schema will have only the primary key on the "id" column for the same table.

    Technically,Enterprise Replication is based on a replication key to replicate data and ensure data consistency/coherence between the different informix instances.

    All tables that are replicated must have a replication key that is composed of one or more columns that uniquely identifies each row. The replication key must be the same on all servers that participate in the replicate.

    If you do not want to add the "database" column and change the primary key definition on each informix instance, you should use the ERKEY shadow columns in place of a primary key.


    -- Franck Thomas ConsultiX franck.thomas@consult-ix.fr http://www.consult-ix.fr Téléphone : 33 (0) 1 39 12 18 00 Mobile    : 33 (0) 6 78 81 09 33
    Le 10/05/2023 à 17:25, thibaut vandewalle via IBM TechXchange Community a écrit :
    010001880644d4ae-8b37ef7a-bc44-4f8f-8a87-62357f36ad6e-000000@email.amazonses.com">
    Hi,   In our organization, we have several informix ids server (~30) and databases (~250)  with the same schema on informix IDS server (version...
    Server IDS idsLake

    o   Database dataLake

    §  Table customer

    ·         Column id (PK)

    ·         Column dataBase (PK)

    ·         Column name

    ·         ....

    §  Table Adress

    ·         Column id (PK)

    ·         Column dataBase (PK)

    ·         Column description

    ·         .....

    For each table, the primary key is composed with ID and Database to hanble duplicated keys from the different database.

     

     

    The command to replicate without filling the database column (executed in the  Server IDS idsLake) is :

    cdr define repl -M g_er -c g_er -C always repl _ind -S row -k ID -f y -D y -T "P db1@ids1:informix.customer" "select name, description from customer" "R dataLake@idsLake:informix.customer" "select name, description from customer";

     

    To fill the column, I hoped i could do something like this

    cdr define repl -M g_er -c g_er -C always repl _ind -S row -k ID -f y -D y -T "P db1@ids1:informix.customer" "select name, description from customer" "R dataLake@idsLake:informix.customer" "select name, description, 'ids1' as database from customer";

    But it does not work.

     Question : how can i know the replication source to fill the column database ?

     

    PS : I know I can handle the database origin with asynchronous post trigger with jsonspl routine but i prefer to avoid it for this use case (because I have to code the replication et manage it by myself)

    Thanks by advance



    ------------------------------
    thibaut vandewalle
    ------------------------------
      Reply to Group Online   View Thread   Recommend   Forward   Flag as Inappropriate  


     
    You are subscribed to "Informix" as franck.thomas@consult-ix.fr. To change your subscriptions, go to My Subscriptions. To unsubscribe from this community discussion, go to Unsubscribe.