This guide walks through your first SDK workflow: authenticating, creating a project, building a simple DataStage flow, running it, and regenerating code from a flow export.
Prerequisites
- A valid watsonx.data integration instance (a free instance can be created)
- A valid API key for watsonx.data integration
- A Python environment with the SDK installed
You can generate an API key in the platform. See the watsonx.data integration SDK documentation for help with API keys and setup.
1: Set API key
api_key = 'SET_API_KEY_HERE'
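As an alternative to pasting the key into the script, it can be read from an environment variable. The following is a minimal sketch and not part of the SDK: the helper name load_api_key and the variable name WATSONX_API_KEY are hypothetical choices made for this example.

```python
import os


def load_api_key(env_var: str = "WATSONX_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Both this helper and the default variable name are assumptions
    made for illustration; they are not defined by the SDK.
    """
    value = os.environ.get(env_var, "").strip()
    if not value:
        # Fail early with a clear message instead of sending an empty key.
        raise RuntimeError(f"Set the {env_var} environment variable to your API key.")
    return value
```

With the variable exported in your shell, api_key = load_api_key() can stand in for the hard-coded assignment above, which keeps the key out of source control.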
2: Authentication
from ibm_watsonx_data_integration.common.auth import IAMAuthenticator
from ibm_watsonx_data_integration import Platform
from ibm_watsonx_data_integration.services.datastage import *
auth = IAMAuthenticator(api_key=api_key)
platform = Platform(auth, base_api_url="https://api.ca-tor.dai.cloud.ibm.com")
3: Create a project
To create a new project, call Platform.create_project(). Only name is required; description, tags, public, and project_type are optional. If project_type is not provided, it defaults to wx.
project = platform.create_project(
    name='My first project',
    description='Building sample batch flows',
    tags=['flow_test_project'],
    public=True,
    project_type='wx'
)
project
Expected output
Project(guid='xxxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx', name='My first project')
4: Create a basic RowGenPeek flow
# Flow
flow = project.create_flow(
    name="RowGenPeek",
    environment=None,
    flow_type="datastage"
)
# Stages
row_generator = flow.add_stage("Row Generator", "Row_Generator")
row_generator.configuration.runtime_column_propagation = False
peek = flow.add_stage("Peek", "Peek")
peek.configuration.runtime_column_propagation = False
# Graph
link_1 = row_generator.connect_output_to(peek)
link_1.name = "Link_1"
row_generator_schema = link_1.create_schema()
row_generator_schema.add_field("VARCHAR", "COLUMN_1").length(100)
project.update_flow(flow)
print("Flow created successfully:", flow.name)
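If you expect to build this flow more than once, the steps above can be wrapped in a function. This is a sketch only: the function name build_row_gen_peek_flow and its parameters are hypothetical, and it uses only the SDK calls already shown in this step, assuming a project object like the one created in step 3.

```python
def build_row_gen_peek_flow(project, flow_name="RowGenPeek",
                            column_name="COLUMN_1", length=100):
    """Create a Row Generator -> Peek flow in `project` and save it.

    Hypothetical helper: it repeats the calls from this step and
    assumes `project` came from platform.create_project().
    """
    # Flow
    flow = project.create_flow(
        name=flow_name,
        environment=None,
        flow_type="datastage",
    )

    # Stages
    row_generator = flow.add_stage("Row Generator", "Row_Generator")
    row_generator.configuration.runtime_column_propagation = False
    peek = flow.add_stage("Peek", "Peek")
    peek.configuration.runtime_column_propagation = False

    # Graph: one link carrying a single VARCHAR column
    link = row_generator.connect_output_to(peek)
    link.name = "Link_1"
    schema = link.create_schema()
    schema.add_field("VARCHAR", column_name).length(length)

    # Persist the stages, link, and schema on the flow
    project.update_flow(flow)
    return flow
```

Calling flow = build_row_gen_peek_flow(project) would then produce the same flow as the inline code, with the column name and length left adjustable.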
5: Run a job from the flow
6: Regenerate the SDK code for an existing flow
This requires exporting the flow first; see the SDK documentation for the export steps.
7: Full working example script