Cloud Pak for Data


 New to Cloud Pak for Data DataStage

Shahid Mahmud posted Wed May 21, 2025 01:16 PM

Hi experts,

Can you please advise on the following?

  1. What if I want to trap an error during job execution and not let the job fail because of it? Trapping and interrupting would allow other methods/routines or even flows to run when the error occurs, including sending emails to POCs.
  2. We are currently an SAP Data Services shop. Data Services has a script object that can contain Java-style scripting, including sending SQL commands to the source, target, or any other table for that matter. A Data Services workflow/job can potentially contain a single script object holding very complex scripts, including SQL. Is something like that possible in DataStage?
  3. Our production system runs a daily batch plus weekly, monthly, or on-demand jobs as needed. Data Services offers a management console where you can see job execution status/history; start or stop a job; and see logs, row counts, and any errors/warnings. You can also view the job design via a drill-down mechanism to inspect a job or its components without going into edit mode. Does CP4D DataStage offer anything like that, i.e., a batch or job management console where you can start/abort a job and see the status, logs, row counts, etc.?
  4. Does IBM Cloud Pak for Data DataStage allow a join like CUSTOMER_NUMBER = CUST_NUM, where the data element is the same but named differently?

I would appreciate it.

Thanks.

Shahid Mahmud

Ralf Martin IBM Champion

Hi Shahid,

Again, my answers from the other group:

1: In DataStage, at the flow level, you can use a Message Handler to demote a warning to an informational message, and you can decide how many warnings to tolerate in a run before the flow aborts. At the pipeline level, you can decide what to do after an activity that finishes OK, with warnings, or with an error.
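
For the email part of your question 1, one common pattern is to wire the error branch of a pipeline activity to a "Run Bash script" node that notifies the POCs. A minimal sketch, assuming a mail client such as mailx is installed on the runtime and that the pipeline passes in the job name and run ID as environment variables (the variable names here are hypothetical):

    #!/bin/bash
    # Hypothetical failure-notification script for a pipeline error branch.
    # JOB_NAME and RUN_ID are assumed to be supplied by the pipeline;
    # check your environment for the actual variable names.
    RECIPIENTS="poc1@example.com poc2@example.com"
    SUBJECT="DataStage job ${JOB_NAME:-unknown} failed (run ${RUN_ID:-n/a})"
    BODY="The job failed at $(date). Please check the run logs in CP4D."

    # mailx is an assumption; swap in whatever mail client your runtime has.
    echo "$BODY" | mailx -s "$SUBJECT" $RECIPIENTS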

2: I am not sure I understand your question correctly, but in a pipeline you can use a "Run Bash script" node, which should allow you to run just about anything that is runnable on your server; maybe there is an even easier way here, but this is not really my topic. (That applies to CP4D; in DaaS, i.e. DataStage as a Service hosted by IBM, you are not allowed to execute Java or C code.) In DataStage flows there is also the Java Integration stage, which lets you "invoke Java classes from DataStage flows", but I have never used it; you can check the documentation for that stage to see what it can do. If you need to access SAP, there are connectors to do so directly; I used those 15 years ago to access SAP R/3 with DataStage.
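
To make that concrete for the SQL part of your script objects: running ad-hoc SQL from a "Run Bash script" node usually means calling whatever database client is installed on the runtime. A minimal sketch, assuming a Db2 command-line client and credentials supplied as environment variables (all names here are assumptions, not a documented interface):

    #!/bin/bash
    # Hypothetical script-object replacement: send SQL to a target table
    # from a pipeline "Run Bash script" node.
    set -e  # stop on the first failing command so the pipeline sees the error

    # DB_NAME, DB_USER, DB_PASS are assumed to come from the environment.
    db2 connect to "$DB_NAME" user "$DB_USER" using "$DB_PASS"

    # Any SQL a Data Services script object would send can go here.
    db2 "UPDATE STAGE_CTRL SET LOAD_DATE = CURRENT DATE WHERE BATCH_ID = 1"
    db2 "DELETE FROM WORK_TBL WHERE LOAD_DATE < CURRENT DATE - 30 DAYS"

    db2 connect reset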

3: Yes. In legacy you have the DataStage Director and the Operations Console, which together work very well (according to my colleagues who do operations). In CP4D there is now the Job Dashboard, which in my opinion does the job, but currently (for what we need) not as well as legacy; IBM is working on bringing the Job Dashboard up to the functionality level we are used to in legacy. You can see the row counts for every link in a flow, both in the design view (for the current run) and in the logs (for every run for which you still have the logs).
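
Besides the Job Dashboard UI, jobs can also be driven from the command line with the cpdctl dsjob plugin, which mirrors much of what the legacy dsjob command offered. A rough sketch, assuming cpdctl is installed and configured; verify the exact subcommands and flags against the cpdctl documentation for your version:

    #!/bin/bash
    # Hypothetical operations snippets using the cpdctl dsjob plugin.
    PROJECT="my-project"   # assumed project name
    JOB="daily_batch"      # assumed job name

    # Start a job and wait (up to an hour) for it to finish.
    cpdctl dsjob run --project "$PROJECT" --job "$JOB" --wait 3600

    # List the jobs in the project, then pull the log of the latest run.
    cpdctl dsjob list-jobs --project "$PROJECT"
    cpdctl dsjob logdetail --project "$PROJECT" --job "$JOB"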

4: Currently, when using the Join stage, the key columns have to be named the same (it is easy to rename the columns in your flow before the join); this comes from the underlying operator, which has a -key KeyColumn attribute. When using the Lookup stage (which also supports "between" or "greater than" instead of only equality conditions), the column names can be different. There is also the Merge stage, which I have never used outside of training; as far as I remember it also needs equal column names.

Short answer on this: the key column names for a join can be different in your source, but to use the Join stage you have to rename them (well, at least one of them) in your flow to make it work.
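
For illustration, that rename can be done with a Copy, Transformer, or Modify stage placed before the Join. In a Modify stage it is a one-line specification (shown here as a sketch, using the column names from your question):

    CUSTOMER_NUMBER = CUST_NUM

After that, both Join inputs carry the key column as CUSTOMER_NUMBER and the Join stage accepts it as the key.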