Co-authors: James Cho, Katie Le
In this article, we walk through creating an IBM Data Synchronization job from scratch and running it.
What is the IBM Data Synchronization Community Edition tool?
IBM Data Synchronization Community Edition is a free, easy-to-use, high-performance data integration tool that ships as a single-node Docker container. It supports real-time and offline synchronization of massive data volumes out of the box, and it combines the features of Apache SeaTunnel and IBM Carbon UI. It is aimed at enterprises and users who need to navigate the complexities of modern data management, and it serves as a one-stop solution for data integration and data synchronization needs.
NOTE: Currently, IBM Data Synchronization Community Edition is supported on Windows and macOS only.
Prerequisites
To use IBM Data Synchronization Community Edition, you first need to pull the IBM Data Synchronization Docker image from Docker Hub. A complete step-by-step tutorial for doing that is available here.
How to create an IBM Data Synchronization job using the command line?
From the host machine:
Make sure the configuration file exists inside the container. If it does not, copy it in with docker cp, or place it in the volume shared between the host and the container. Then run the following command:
docker exec -it ibmdatasynchronization bash -c '$SEATUNNEL_HOME/bin/seatunnel.sh --config /path/to/config/file'
where $SEATUNNEL_HOME is /opt/seatunnel. The single quotes keep the host shell from expanding the variable, so it is resolved inside the container.
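The article does not show what a configuration file looks like, so here is a minimal sketch of the host-side workflow: write a small job config, copy it into the container with docker cp, and run it. The helper names (write_config, run_job), the CONTAINER and DOCKER variables, and the /tmp destination path are illustrative assumptions, not part of the product. The config follows the general Apache SeaTunnel layout (env, source, sink) and assumes the image bundles the FakeSource and Console connectors, which may not hold for every build.

```shell
#!/usr/bin/env bash
# Sketch only: CONTAINER, DOCKER, write_config and run_job are illustrative names.
CONTAINER="${CONTAINER:-ibmdatasynchronization}"  # container name used in this article
DOCKER="${DOCKER:-docker}"                        # set DOCKER=echo for a dry run

# Write a minimal SeaTunnel batch job: generate 16 fake rows and print them.
# Assumption: the image ships the FakeSource and Console connectors.
write_config() {
  cat > "$1" <<'EOF'
env {
  parallelism = 1
  job.mode = "BATCH"
}

source {
  FakeSource {
    row.num = 16
    schema = {
      fields {
        name = "string"
        age = "int"
      }
    }
  }
}

sink {
  Console {}
}
EOF
}

# Copy the config into the container, then run it there.
run_job() {
  local cfg="$1"
  local dest="/tmp/$(basename "$cfg")"
  "$DOCKER" cp "$cfg" "$CONTAINER:$dest" || return 1
  # \$SEATUNNEL_HOME is escaped so it expands inside the container, not on the host.
  "$DOCKER" exec "$CONTAINER" bash -c "\$SEATUNNEL_HOME/bin/seatunnel.sh --config '$dest'"
}

# Usage:
#   write_config ./fake_to_console.conf
#   run_job ./fake_to_console.conf
```

Note that the container name comes before the command in docker exec. Running with DOCKER=echo prints the two docker commands instead of executing them, which is a cheap way to check paths before touching the container.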
From inside the container:
- Run the following command to get an interactive bash shell in the container:
docker exec -it ibmdatasynchronization bash
- Then, from that shell, start the job:
$SEATUNNEL_HOME/bin/seatunnel.sh --config /path/to/config/file
where $SEATUNNEL_HOME is /opt/seatunnel.
How to create an IBM Data Synchronization job using the UI?
Step 1: Access the UI at http://<host_ip>:8801, where <host_ip> is the IP address of the host machine.
The login credentials to access the UI are:
Username: admin
Password: admin
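Before opening the browser, it can help to confirm that the UI port is actually reachable from your machine. The following is a small sketch using curl; the function names ui_url and check_ui are hypothetical helpers, and the only value taken from this article is port 8801.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: build the UI URL from a host IP and probe it with curl.

# ui_url <host_ip> [port]  (port defaults to 8801, as stated in this article)
ui_url() {
  printf 'http://%s:%s\n' "$1" "${2:-8801}"
}

# check_ui <host_ip>  returns 0 if the server answers without an HTTP error.
check_ui() {
  local url
  url="$(ui_url "$1")"
  if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
    echo "UI reachable at $url"
  else
    echo "UI not reachable at $url" >&2
    return 1
  fi
}

# Usage:
#   check_ui 192.168.1.10
```

If the check fails, the usual suspects are a container that is not running or a port that was not published when the container was started.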
Step 2: Navigating the home page
On the home screen, the left panel shows several tabs, each providing different functionality. The first thing to do is add data sources.
Step 3: Add data sources by clicking the Datasource tab on the left panel. You can choose from a wide list of connectors.
Once you select a connector, fill in its details, such as the name, URL, and driver details.
Step 4: Next, create an IBM Data Synchronization task. Click "Task Synchronization" on the left panel, then click "Create Synchronization Task" and enter the details.
This opens a canvas where you can add data sources by dragging them from the left panel and dropping them onto the canvas.
You can add transformations with the same drag-and-drop mechanism.
Click Save once the task is fully defined.
Step 5: Run and monitor the task
After the run, you can see the task's status, which shows that it was executed and finished successfully.
Conclusion
IBM Data Synchronization lets you create and run synchronization jobs either from the command line or through a drag-and-drop UI. Whether you are a data engineer, analyst, or business user, it allows you to achieve your desired results easily and efficiently.
Further reading
Links to:
- Docker Hub
- Community blog