Hi Shahid
1: In DataStage, at the flow level, you can use a Message Handler to demote a warning to an informational message. You can also decide how many warnings to tolerate in a run before the flow aborts. At the pipeline level, you can decide what to do after an activity has finished OK, with a warning, or with an error.
2: I am not sure I understand your question correctly, but in a pipeline you can use a "Run Bash script" node, which should let you run just about anything that is runnable on your server. This applies to CP4D, not to DaaS (DataStage as a Service, hosted by IBM), where you are not allowed to execute Java or C code. Maybe there is an even easier way; this is not really my area. In DataStage flows there is also the Java Integration stage, which lets you "invoke Java classes from Datastage flows", but I have never used it; you can check the documentation for this stage to see what it is able to do. If you need to access SAP, there are connectors to do so directly; I used those 15 years ago to access SAP R/3 with DataStage.
3: Yes. In legacy you have the DataStage Director and the Operations Console, which together work very well (according to my colleagues who do operations). In CP4D there is now the Job Dashboard, which in my opinion does the job, but currently (for what we need) not as well as the legacy tools. IBM is working on bringing the Job Dashboard to the same level of functionality that we are used to in legacy.
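To make point 1 a bit more concrete, here is a minimal Python sketch of the logic only. It is not DataStage code and does not use any DataStage API. The names `RunLog`, `warning_limit`, `next_step` and the message IDs are hypothetical, and I am assuming that a run aborts once the number of warnings exceeds the configured limit.

```python
from dataclasses import dataclass, field


@dataclass
class RunLog:
    """Toy model of one flow run (hypothetical, not a DataStage class)."""
    warning_limit: int                          # warnings tolerated before abort (assumed semantics)
    demoted: set = field(default_factory=set)   # message IDs demoted to informational
    warnings: int = 0
    infos: int = 0
    aborted: bool = False

    def log_warning(self, message_id: str) -> None:
        if self.aborted:
            return                              # nothing more is logged after an abort
        if message_id in self.demoted:
            self.infos += 1                     # handler demotes this warning to information
            return
        self.warnings += 1
        if self.warnings > self.warning_limit:
            self.aborted = True                 # too many warnings: the run fails

    def status(self) -> str:
        if self.aborted:
            return "error"
        return "warning" if self.warnings else "ok"


def next_step(status: str) -> str:
    """Pipeline-level decision after an activity finishes (step names are made up)."""
    return {
        "ok": "continue",
        "warning": "notify_and_continue",
        "error": "send_email_and_recover",
    }[status]


if __name__ == "__main__":
    log = RunLog(warning_limit=2, demoted={"MSG-1"})
    for message_id in ["MSG-1", "MSG-2", "MSG-2"]:
        log.log_warning(message_id)
    print(log.status(), next_step(log.status()))
```

In a real pipeline the branching would be configured on the links or conditions after the activity; the dictionary in `next_step` only stands in for that.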
------------------------------
Ralf Martin
Principal Consultant
infologistix GmbH
Bregenz
------------------------------
Original Message:
Sent: Tue May 20, 2025 01:38 PM
From: Shahid Mahmud
Subject: Evaluating DataStage for my organization
Hi Ralf, it is very kind of you to respond to my questions in such detail. You are very knowledgeable. Would you be able to address the following?
- What if I want to trap an error during job execution and not let the job fail because of it? Trapping the error and interrupting the job would allow other methods/routines, or even flows, to run when the error occurs, including sending emails to POCs.
- We are currently an SAP Data Services shop. There is a script object that can contain Java-type scripting, including sending SQL commands to the source or target or any other table for that matter. Potentially, a Data Services workflow/job can contain a single script object which can contain very complex scripts, including SQL. Is that possible?
- Our production system runs a daily batch and weekly, monthly, or on-demand jobs as needed. Data Services offers a management console where you can see job execution status/history. You can start or stop a job and see logs, row counts, and any errors/warnings. You can also view the job design via a drill-down mechanism to inspect a job or its components without going into edit mode. Does DataStage offer anything like that?
------------------------------
Shahid Mahmud
Original Message:
Sent: Sun May 18, 2025 07:52 AM
From: Ralf Martin
Subject: Evaluating DataStage for my organization
Hi Shahid,
My answers are based on the new CP4D DataStage, not on Legacy 11.7.
- Is there a looping mechanism in DataStage whereby a dataflow and other objects can be executed until a looping condition is not met? Yes, both in a Watson Pipeline and in a DataStage flow.
- Is there a native DataStage scripting language that includes routine ETL functions? Yes, in a DataStage flow in the Transformer stage, and also in Watson Pipeline expressions.
- Is there a DataStage documentation more from general administration, job-scheduling, and execution perspective? Is there a console available to just see job-execution logs and related information? Yes, there is a lot of good documentation. With CPDCTL you can execute commands on the command line, e.g. to run jobs or get the status of objects in the repository.
- Is there a method to export a job/flow and then reimport into the same or a different project? Yes
- To what extent is the DataStage metadata accessible for reporting or copying to another location for analysis? Is that a Data Governance function, maybe? I don't have much knowledge on this subject, but the data is stored in a database, where you SHOULD be able to access it. In CP4D this SHOULD be easier than with the legacy XMETA.
- Is there a way to execute a job in debugging mode and assess the data at each breakpoint? They introduced something like this in legacy, but I never used it (afaik it is not there anymore in CP4D). The normal way to solve this is to add a Peek stage somewhere to write (selected) data to the log. I don't know how a debugger would work properly with a flow that is a little more complex, with several sources joined.
- Is there a way to lock an asset/object for editing by one developer and then put it back into a shared repository for some other developer to pick up? Is there a checkout and a check in option? Does an object show 'in use' so someone else does not pick up and work it at the same time? In legacy, if an object was opened by one developer, it could not be opened for editing by someone else. In CP4D, if two people open a flow and one saves it, the other person is informed and offered the options Save as, Ignore, and Reload, but afaik there is currently no locking planned. In CP4D there is a Git integration.
- If there are multiple targets possibly referencing the same table with multiple streams going in, does DataStage allow an update-order? Not as far as I know, but you can build something in a flow by making the reject link of, e.g., your insert the reference link of a dummy lookup before your update. That way, all rows for your insert go first and the update goes later. Alternatively, you write your inserts and updates to datasets and have flows that write the data to the tables in sequential order in the pipeline.
- Is there a way to see/view a flow or job on a view-only screen with a drilldown to see the individual transform details, without needing to log in the edit mode? I am not sure what you want here. If you don't want to view a flow or pipeline in the designer, then, especially for flows, you can also see the run metrics in the job log.
- Does DataStage allow building a custom transform? Yes. Using C++, you can build user-defined functions, which can be used in an expression; Buildop stages, which are quite easy to implement (although there is not much you can do here that is not possible in a Transformer stage); and CustomOp stages, which are very flexible (but afaik it is not well documented how to implement them).
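On the update-order question above, the second workaround (stage the inserts and the updates separately, then write them in a fixed sequence) can be illustrated outside DataStage. This is only a small Python sketch using the standard `sqlite3` module; the table `target`, its columns, and the function `apply_in_order` are made up for the example, and the two lists stand in for the staged datasets.

```python
import sqlite3


def apply_in_order(conn, inserts, updates):
    """Write all staged inserts first, then all staged updates, in one transaction."""
    with conn:  # commits on success, rolls back if either step raises
        conn.executemany("INSERT INTO target (id, val) VALUES (?, ?)", inserts)
        conn.executemany("UPDATE target SET val = ? WHERE id = ?", updates)


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE target (id INTEGER PRIMARY KEY, val TEXT)")

    staged_inserts = [(1, "new"), (2, "new")]   # (id, val)
    staged_updates = [("changed", 1)]           # (val, id), refers to a row inserted above

    apply_in_order(conn, staged_inserts, staged_updates)
    print(conn.execute("SELECT id, val FROM target ORDER BY id").fetchall())
```

The point is only the sequencing: the update on row 1 finds its row because the inserts are guaranteed to have run first. In a pipeline, the same guarantee would come from running the insert flow and the update flow one after the other.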
I hope this helps
------------------------------
Ralf Martin
Principal Consultant
infologistix GmbH
Bregenz
Original Message:
Sent: Wed May 14, 2025 11:32 AM
From: Shahid Mahmud
Subject: Evaluating DataStage for my organization
Can someone please help with the following questions? Answer whichever you know.
- Is there a looping mechanism in DataStage whereby a dataflow and other objects can be executed until a looping condition is not met?
- Is there a native DataStage scripting language that includes routine ETL functions?
- Is there a DataStage documentation more from general administration, job-scheduling, and execution perspective? Is there a console available to just see job-execution logs and related information?
- Is there a method to export a job/flow and then reimport into the same or a different project?
- To what extent is the DataStage metadata accessible for reporting or copying to another location for analysis? Is that a Data Governance function, maybe?
- Is there a way to execute a job in debugging mode and assess the data at each breakpoint?
- Is there a way to lock an asset/object for editing by one developer and then put it back into a shared repository for some other developer to pick up? Is there a checkout and a check in option? Does an object show 'in use' so someone else does not pick up and work it at the same time?
- If there are multiple targets possibly referencing the same table with multiple streams going in, does DataStage allow an update-order?
- Is there a way to see/view a flow or job on a view-only screen with a drilldown to see the individual transform details, without needing to log in the edit mode?
- Does DataStage allow building a custom transform?
------------------------------
Shahid Mahmud
------------------------------