Thanks, Eric, for your reply. Do you have any documentation for this? I will also try to connect with the IBM folks.
Original Message:
Sent: Tue April 28, 2026 10:48 AM
From: Eric Greisdorf
Subject: Data Volume Validation
Hi Rahul,
An excellent option is to use the Control Hub API Processor to make the JobRunner REST calls, e.g. /jobrunner/rest/v1/metrics/job/{jobId}
This will return the record counts and timings for each Pipeline stage.
Then use a Lookup Processor / Executor to count the expected records, or inspect an audit trail.
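As a rough illustration of the REST call above, made outside of a pipeline, here is a minimal Python sketch. Only the endpoint path comes from this thread; the base URL, the authentication headers, and the field name passed to `collect_field` are placeholders or assumptions, since the layout of the metrics response is not described here.

```python
import json
import urllib.request

# Endpoint path as quoted in this thread; everything else is an assumption.
METRICS_PATH = "/jobrunner/rest/v1/metrics/job/{job_id}"


def metrics_url(base_url, job_id):
    """Build the JobRunner metrics URL for one job."""
    return base_url.rstrip("/") + METRICS_PATH.format(job_id=job_id)


def fetch_job_metrics(base_url, job_id, headers):
    """GET the metrics document and parse it as JSON.

    `headers` must carry whatever authentication your Control Hub
    instance expects; the header names are left to the caller.
    """
    request = urllib.request.Request(metrics_url(base_url, job_id), headers=headers)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)


def collect_field(document, field_name):
    """Return every value stored under `field_name`, at any nesting depth.

    The response layout is not documented in this thread, so this walks
    the whole JSON document instead of assuming a fixed path.
    """
    found = []
    if isinstance(document, dict):
        for key, value in document.items():
            if key == field_name:
                found.append(value)
            found.extend(collect_field(value, field_name))
    elif isinstance(document, list):
        for item in document:
            found.extend(collect_field(item, field_name))
    return found
```

A call would look like `collect_field(fetch_job_metrics(base_url, job_id, headers), "outputRecords")`, where `"outputRecords"` is a hypothetical field name; inspect a real response to find the fields that actually hold the per-stage counts.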
For more details specific to your environment and constraints, I recommend reaching out to your IBM account team.
Regards,
------------------------------
Eric Greisdorf
Original Message:
Sent: Mon April 27, 2026 02:33 PM
From: Rahul Dharmawat
Subject: Data Volume Validation
Thanks a lot Eric for your reply.
We would really appreciate it if you could give more detail on the options for data volume validation. We load the data through a Kafka topic (in batches), and there is no specific window for the Kafka topic.
The Windowing Aggregator is not the right option for us. The Groovy Evaluator is also not working due to our environment setup.
Do you have more material on the approach where a pipeline uses standard stages (e.g., Control Hub API or JDBC Query) to retrieve and compare record counts and data, with a Stream Selector stage to route/alert accordingly?
Really appreciate your help.
------------------------------
Rahul Dharmawat
Original Message:
Sent: Mon April 27, 2026 11:40 AM
From: Eric Greisdorf
Subject: Data Volume Validation
Hi Rahul,
Yes, absolutely, these are common scenarios for IBM StreamSets.
- Data volume validation - Can be accomplished several ways, depending on the requirements:
  a. A pipeline uses standard stages (e.g., Control Hub API or JDBC Query) to retrieve and compare record counts and data, and a Stream Selector stage to route/alert accordingly.
  b. REST API to retrieve Job metrics.
  c. Python SDK to retrieve Job metrics.
  Would you like more information on a specific approach?
- Pipeline execution alerts - We've posted this video training to the community library for setting up rules, alerts and subscriptions. A 'Pipeline Commit' subscription is used for this GitHub repository integration example.
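The compare-and-route idea in option (a) comes down to one comparison followed by a lane choice. Below is a minimal Python sketch of that logic only, written outside of any pipeline; the function name, the lane names, and the `tolerance` parameter are hypothetical, and inside a pipeline the same condition would be written in the Stream Selector's own condition syntax instead of Python.

```python
def route_by_volume(expected, actual, tolerance=0.0):
    """Pick a lane by comparing the actual record count with the expected one.

    `tolerance` is the allowed relative difference (0.01 means 1 percent).
    Returns "ok" when the counts agree within tolerance, otherwise "alert".
    """
    if expected < 0 or actual < 0:
        raise ValueError("record counts cannot be negative")
    if tolerance < 0:
        raise ValueError("tolerance cannot be negative")
    allowed_difference = expected * tolerance
    return "ok" if abs(expected - actual) <= allowed_difference else "alert"
```

For example, `route_by_volume(1000, 1000)` returns `"ok"`, while `route_by_volume(1000, 900, tolerance=0.05)` returns `"alert"` because the difference of 100 records exceeds the 50 allowed.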
Let us know if these get you started, and anything we can help with.
Best Regards,
------------------------------
Eric Greisdorf
Original Message:
Sent: Sun April 26, 2026 08:07 PM
From: Rahul Dharmawat
Subject: Data Volume Validation
Hello IBM Team,
We would like to confirm whether the following scenarios can be implemented using IBM StreamSets:
Data volume validation
Is it possible to validate the amount of data processed by StreamSets? We have a Kafka topic that is loaded in batches.
Pipeline execution alerts
Is there a way to configure automatic alerts (e.g., email notifications) when a pipeline has not been executed or fails to run as expected?
We would appreciate your guidance or best practices related to these scenarios.
Best regards,
Rahul
------------------------------
Rahul Dharmawat