  • 1.  Export & Import Operator

    Posted Fri February 05, 2021 06:11 PM

    Hi Team,

    We use the Streams Export and Import operators extensively. According to the documentation, they are implemented in C++.

    Do the Export/Import operators use TCP/IP internally?

    If yes

    Q1. Suppose a single job exports data to more than 3 downstream jobs, and the first job has exported 10 records to them.

    Each downstream job will have its own copy of the data. Does the TCP layer also create a copy of the data for each downstream job and keep it in a TCP buffer?

    The congestion policy only takes effect once the connection is established. If a downstream job is restarted, we expect no data loss; how does the exporting job make sure of that after the restart?

    Q2. Job 1 exports data with 30 columns and connects to 3 downstream jobs.

    Example: downstream job one needs columns 1 to 20.

    Downstream job two needs columns 21 to 28.

    Downstream job three needs columns 29 to 30.

    Which of the approaches below is advisable?

    Approach 1 :

    Job 1 has a single Export operator that connects to all 3 downstream jobs. Each downstream job filters out the columns it needs after the import.

    Approach 2 :

    Job 1 creates three output streams, each with only the expected columns:

    Output stream 1 : 20 columns

    Output stream 2 : 8 columns

    Output stream 3 : 2 columns

    and has 3 Export operators, each connecting to one downstream job.

    Thanks.






    #OpenSourceOfferings
    #Streams
    #Support
    #SupportMigration


  • 2.  RE: Export & Import Operator
    Best Answer

    Posted Fri February 05, 2021 10:28 PM

    Hi Nagesh,

    Export/Import operators do use TCP/IP to communicate data.

    Q1: The Export operator will have a socket connection for each Import operator and will perform a write to each socket using the same buffer.

    The Export operator does not make any guarantees about data loss. If an Import operator connection is not present (either because it's restarting or hasn't been brought up yet) then it will miss out on that data.

    Regarding congestion policy, you're correct that it is only applied once a connection to an Import operator is established. Depending on that policy, the Export operator may drop the connection to the Import operator if it's not consuming data fast enough (with the 'dropConnection' policy), or it will wait for the connection to accept more data. Again, if the Import operator connection gets disconnected for whatever reason, the Export operator will move on to the next connection.
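
    To make the behaviour described for Q1 concrete, here is a rough Python model (not the actual C++ implementation, which I can't show here). The `Connection` class, the `export_tuple` function, the `capacity` field and the status strings are all made up for illustration; the policy name "wait" is also an assumption for the non-dropping behaviour. The model only shows the three outcomes discussed above: one shared payload written to every live connection, an absent Import missing the data, and a slow Import either being dropped or holding things up.

```python
from collections import deque


class Connection:
    """Toy stand-in for one Export-to-Import socket (hypothetical model)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity   # tuples the receiver can buffer (assumed)
        self.queue = deque()
        self.connected = True

    def try_send(self, payload):
        # A full queue models an Import that is not consuming fast enough.
        if len(self.queue) >= self.capacity:
            return False
        self.queue.append(payload)  # stores a reference, not a copy
        return True


def export_tuple(payload, connections, policy="wait"):
    """Offer one shared payload to every connection and report what happened."""
    status = {}
    for conn in connections:
        if not conn.connected:
            status[conn.name] = "missed"      # no connection: data is not kept
        elif conn.try_send(payload):
            status[conn.name] = "sent"
        elif policy == "dropConnection":
            conn.connected = False            # drop the slow Import, move on
            status[conn.name] = "dropped"
        else:
            status[conn.name] = "blocked"     # the real operator would wait here
    return status


if __name__ == "__main__":
    fast = Connection("fast", capacity=10)
    slow = Connection("slow", capacity=1)
    down = Connection("restarting", capacity=10)
    down.connected = False

    for i in range(2):
        print(export_tuple(b"record", [fast, slow, down], "dropConnection"))
```

    In this toy run the "restarting" connection misses both records, and the "slow" connection is dropped on the second record because its one-slot queue is still full.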

    Q2: Either approach will work, and each has its own pros and cons. However, I think Approach 2 would be more beneficial, since you'll only be sending the required data rather than extra data that's going to be filtered out immediately anyway.
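
    As a back-of-the-envelope check on Q2, the sketch below counts how many column values leave Job 1 per record under each approach. It is plain Python, not SPL; the `NEEDED` table, `project` and `values_sent` are hypothetical names, and it assumes every column costs about the same to send, which will not hold exactly for real tuple types.

```python
# Column ranges each downstream job needs (1-based, inclusive), from the question.
NEEDED = {"job1": (1, 20), "job2": (21, 28), "job3": (29, 30)}


def project(record, first, last):
    """Keep columns first..last (1-based, inclusive)."""
    return record[first - 1:last]


def values_sent(record, approach):
    """Return the number of column values sent to each downstream job."""
    if approach == 1:
        # Single Export: every job receives the full record, then filters.
        return {job: len(record) for job in NEEDED}
    # Three Exports: each job receives only the columns it needs.
    return {job: len(project(record, *rng)) for job, rng in NEEDED.items()}


if __name__ == "__main__":
    record = list(range(1, 31))  # one record with 30 columns
    for approach in (1, 2):
        per_job = values_sent(record, approach)
        print(approach, per_job, "total:", sum(per_job.values()))
```

    Under these assumptions Approach 1 sends 90 column values per record (30 to each of the 3 jobs) and Approach 2 sends 30 (20 + 8 + 2), at the cost of doing the projection inside Job 1.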








  • 3.  RE: Export & Import Operator
    Best Answer

    Posted Mon February 08, 2021 06:51 AM

    Hi,

    Thanks for the reply. I now have a good picture of how the Export and Import operators work internally.

    Is there any documentation on this other than Operator Export (ibm.com)?

    Thanks

    Nagesh








  • 4.  RE: Export & Import Operator
    Best Answer

    Posted Mon February 08, 2021 02:46 PM

    This page: https://www.ibm.com/support/knowledgecenter/en/SSCRJU_4.2.1/com.ibm.streams.ref.doc/doc/dynamicappcomposition.html

    may provide some more insight into Import and Export, but I have not been able to find a page that covers our Q&A exactly.





