JSR-352 (Java Batch) Post #152: How-To: Passing Data Between Steps

By David Follis posted Thu September 02, 2021 08:58 AM

This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

To start at the beginning, follow the link to the first post.

The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, Stitcher, or use the link to the RSS feed.

A common problem in multi-step jobs is figuring out how to pass data from one step to another.  In cases where one step does some processing on a file and a subsequent step does some processing on the results, you are just passing files around, and file naming becomes the issue.  Should the files be permanent files, in which case you can name them something appropriate, or are they just temporary files that exist while the job runs and that you’ll throw away later?  If they are temporary, can you put them somewhere separate from other instances of this job or somehow name them in a way that avoids collisions?
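One way to handle the temporary-file case is to build the directory name from something unique to the job instance.  Here is a minimal sketch, assuming the job name and instance id come from JobContext.getJobName() and JobContext.getInstanceId().  The StepFiles class, its method names, and the directory layout are made up for illustration and are not part of the sample.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical helper: gives each job instance its own working directory so
// that concurrent instances of the same job never share temporary files.
class StepFiles {

    // instanceId is assumed to come from JobContext.getInstanceId()
    static Path workDir(Path baseDir, String jobName, long instanceId) {
        Path dir = baseDir.resolve(jobName + "-" + instanceId);
        try {
            // Creates the directory if needed; does nothing if it already exists
            return Files.createDirectories(dir);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Every step that passes the same inputs gets the same path back,
    // so the writing step and the reading step agree on the file location.
    static Path intermediateFile(Path baseDir, String jobName,
                                 long instanceId, String fileName) {
        return workDir(baseDir, jobName, instanceId).resolve(fileName);
    }
}
```

The sketch uses the instance id rather than the execution id because a restart gets a new execution id but keeps the same instance id, so a restarted job can still find the files an earlier attempt left behind.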

A database is, of course, another good place to keep data between steps.  Whether the data is permanent or transient, a database is very reliable, and it might be necessary to allow the job to restart if it fails partway through.  If it isn’t a whole table full of information you need to pass between steps, the batch runtime can also hold a small amount of persistent state for you.  Strictly speaking, that persistent user data lives on the StepContext rather than the JobContext, and a later step can read it back from the earlier step’s StepExecution.

Speaking of the JobContext, it also has transient user data that you can set and fetch with setTransientUserData() and getTransientUserData().  If you don’t need the persistence, using the transient data methods can be pretty handy.  Of course, the data goes away when the job ends (successfully or not).

Another nice aspect of the transient data is that it doesn’t have to be serializable because it isn’t going to be written anywhere.  All you need is an Object, so basically anything you want to pass will do.  But there are some tricky bits to using it.

First of all, there is only one transient data object for the job.  If Step1 wants to use it to pass something to Step4, then Step2 can’t also use it to pass something to Step5.  If you plan to use it a lot then you need to create some aggregating Object that contains the objects for the various steps to use.  Not a big deal, but it requires some coordination across the job which may make it difficult to just click together different bits from different jobs to make a new one. 
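The aggregating Object could be as simple as a map with agreed-upon keys.  Here is a minimal sketch of the idea.  The SharedStepData class and the key names are hypothetical; the only real API involved would be the setTransientUserData() and getTransientUserData() calls used to store and fetch the one instance.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical aggregate meant to be stored as the job's single transient
// data object, e.g. jobContext.setTransientUserData(new SharedStepData())
// in the first step. Later steps fetch it with getTransientUserData() and
// cast it back to SharedStepData.
class SharedStepData {

    // One slot per sender/receiver pair, keyed by a name the steps agree on.
    // ConcurrentHashMap rejects null keys and null values.
    private final Map<String, Object> slots = new ConcurrentHashMap<>();

    void put(String key, Object value) {
        slots.put(key, value);
    }

    // Returns null if nothing was stored under the key; throws
    // ClassCastException if the stored value is not of the requested type.
    <T> T get(String key, Class<T> type) {
        return type.cast(slots.get(key));
    }
}
```

With something like this, Step1 could put its value under a key such as "step1.to.step4" while Step2 uses "step2.to.step5", and neither overwrites the other.  The coordination cost does not go away, though: every step still has to agree on the holder class and the key names.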

Another tricky bit in using the transient data is the possibility of multiple JobContext objects for the job.  If you use split/flows or partitions, each thread can end up with its own JobContext and thus its own separate transient data.

But in the simple case where the batchlet that is Step1 just needs to pass some stuff to the batchlet that is Step2, it is pretty easy.  And that’s the case our sample shows.  Our JSL file is JobContextTransient.xml and it just runs two batchlets one after the other.  The BatchletTransientSender is the first step that shows setting the transient user data to a simple string.  The BatchletTransientReceiver is the second step that shows fetching the string from the JobContext. 
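To give a feel for the pattern without a batch runtime, here is a minimal sketch.  It is not the code from the sample repository.  The TransientData interface is a stand-in that copies only the two JobContext methods involved, and the class names and the string being passed are made up.  In a real batchlet the container would inject the real JobContext and the logic would sit in the Batchlet's process() method.

```java
// Stand-in for the two javax.batch.runtime.context.JobContext methods used
// here, so the sketch runs without a batch runtime (an assumption made for
// this example; a real batchlet would use an injected JobContext instead).
interface TransientData {
    Object getTransientUserData();
    void setTransientUserData(Object data);
}

// Simplest possible holder, for the demo only
class DemoJobContext implements TransientData {
    private Object data;
    public Object getTransientUserData() { return data; }
    public void setTransientUserData(Object data) { this.data = data; }
}

// Plays the role of the first step: sets the transient data to a string
class SenderSketch {
    private final TransientData jobContext;
    SenderSketch(TransientData jobContext) { this.jobContext = jobContext; }

    // Mirrors Batchlet.process(), which returns the step's exit status
    String process() {
        jobContext.setTransientUserData("Hello from Step1");
        return "COMPLETED";
    }
}

// Plays the role of the second step: fetches the string back
class ReceiverSketch {
    private final TransientData jobContext;
    String received;
    ReceiverSketch(TransientData jobContext) { this.jobContext = jobContext; }

    String process() {
        // The transient data comes back as Object, so a cast is needed
        received = (String) jobContext.getTransientUserData();
        return "COMPLETED";
    }
}

class TransientDemo {
    public static void main(String[] args) {
        DemoJobContext ctx = new DemoJobContext();
        new SenderSketch(ctx).process();          // "Step1"
        ReceiverSketch step2 = new ReceiverSketch(ctx);
        step2.process();                          // "Step2"
        System.out.println("Step2 received: " + step2.received);
    }
}
```

The receiver has to know what type the sender stored, which is the same coordination issue mentioned earlier, just on a smaller scale.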

The sample parts are here:  https://github.com/follisd/batch-samples