JSR-352 (Java Batch) Post #98: Exploiting a Split to Stop a Chunk

By David Follis posted Wed July 08, 2020 08:13 AM

This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

This series is also available as a podcast on iTunes, Google Play, and Stitcher, or through the RSS feed.

This post explores an idea I had that might help solve a very specific problem with certain batch jobs. Suppose you have a job with a chunk step. For whatever reason, your ItemReader can sometimes get ‘stuck’: it reaches out to wherever it gets data from and doesn’t come back. You would like to be able to stop the job when this happens.

The stop operation does two things: it marks the job as stopping in the Job Repository, and it drives the stop method of a batchlet, if a batchlet is what is currently running in the job. For a chunk step nothing gets control. The assumption is that the read/process execution loop will come around to a check on the job status, notice the job is stopping, and stop without continuing the loop.
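To make the problem concrete, here is a deliberately simplified model of that loop. It is not the batch runtime's actual code: `ChunkLoopModel`, `readItem`, and `stopping` are names invented for illustration, and processing, writing, and checkpointing are left out. The only detail borrowed from the real API is that a reader signals the end of its data by returning null.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

/** Simplified, hypothetical model of a chunk loop; not the runtime's real code. */
final class ChunkLoopModel {

    /** Returns how many items were read before the loop ended. */
    static int run(Supplier<String> readItem, AtomicBoolean stopping) {
        int itemsRead = 0;
        // The stopping flag is only looked at here, between reads.
        while (!stopping.get()) {
            // If this call never returns, the check above is never reached again.
            String item = readItem.get();
            if (item == null) {
                break; // reader reports end of data
            }
            itemsRead++;
        }
        return itemsRead;
    }
}
```

In this model, setting `stopping` has no effect on a thread that is still inside `readItem.get()`, which is exactly the stuck-reader case.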

In our scenario, though, that isn’t going to happen because we are stuck in the ItemReader trying to read an item. Suppose we have a handle of some kind to the connection we are using to read data, and suppose there is some sort of ‘cancel’ operation we can call on that connection to try to terminate an outstanding request. That would be nice, but the chunk won’t get control to issue it: there is no stop method in a chunk like there is in a batchlet.

Suppose that we wrap our chunk step in a flow.  And suppose that flow is part of a split with a second flow.  And the second flow consists of a batchlet.  The batchlet doesn’t do anything.  It is just there for its stop method to be driven in the event a stop is issued.  That stop method could then cancel the hung read operation (if there is one).

For this to work, you would need to share whatever connection handle the reader uses in one flow with the batchlet in the other flow. You would also probably want some flags and locking so the batchlet’s stop method can tell whether it is supposed to try to cancel an operation or not.
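One way the sharing, flag, and locking might look is sketched below. Everything here is an assumption for illustration: `CancellableConnection` stands in for whatever handle your data source really provides, `ReadCanceller` is a made-up helper, and the static registry assumes both flows of the split run in the same JVM. Keying it by job execution id (which both sides could get from `JobContext.getExecutionId()`) is just one possible choice.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical stand-in for the real connection handle. */
interface CancellableConnection {
    void cancel(); // assumed to make an outstanding read return or throw
}

/** Hypothetical holder shared by the reader and the do-nothing batchlet. */
final class ReadCanceller {
    private static final ConcurrentHashMap<Long, ReadCanceller> BY_EXECUTION =
            new ConcurrentHashMap<>();

    /** Both flows look up the same instance using the job execution id. */
    static ReadCanceller forExecution(long executionId) {
        return BY_EXECUTION.computeIfAbsent(executionId, id -> new ReadCanceller());
    }

    /** Call when the job ends so the registry does not grow forever. */
    static void discard(long executionId) {
        BY_EXECUTION.remove(executionId);
    }

    private final ReentrantLock lock = new ReentrantLock();
    private CancellableConnection connection; // guarded by lock
    private boolean readInFlight;             // guarded by lock

    /** Reader: call just before issuing a read that might hang. */
    void beginRead(CancellableConnection c) {
        lock.lock();
        try {
            connection = c;
            readInFlight = true;
        } finally {
            lock.unlock();
        }
    }

    /** Reader: call in a finally block once the read returns or throws. */
    void endRead() {
        lock.lock();
        try {
            readInFlight = false;
            connection = null;
        } finally {
            lock.unlock();
        }
    }

    /** Batchlet stop(): cancels only if a read is outstanding; returns true if it did. */
    boolean cancelIfReading() {
        lock.lock();
        try {
            if (!readInFlight) {
                return false;
            }
            connection.cancel(); // issued while holding the lock
            return true;
        } finally {
            lock.unlock();
        }
    }
}
```

The reader does not hold the lock while the read itself is in progress, so the stop method can always get in. There is still a small window where the read returns just before the cancel is issued, so the cancel operation needs to be harmless when nothing is outstanding.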

You also need a way for the chunk step to communicate to the batchlet that it has finished and the batchlet can exit. 
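A one-shot latch is one possible way to do that. In the sketch below, `ChunkDoneSignal` and `WatchdogSketch` are invented names, and the exit status strings are arbitrary. `WatchdogSketch` only mirrors the shape of a batchlet's `process()` and `stop()` methods so the example stands alone; a real one would implement `javax.batch.api.Batchlet`, and the chunk side would call `release()` when the step ends (a step listener is one place that might do it).

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

/** Hypothetical one-shot signal shared by the chunk flow and the batchlet flow. */
final class ChunkDoneSignal {
    private final CountDownLatch released = new CountDownLatch(1);

    /** Called when the chunk step ends, or from the batchlet's stop method. */
    void release() {
        released.countDown();
    }

    /** Returns true if released, false on timeout or if the thread was interrupted. */
    boolean await(long timeoutMillis) {
        try {
            return released.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
            return false;
        }
    }
}

/** Shape of the do-nothing batchlet; method names mirror Batchlet's process() and stop(). */
final class WatchdogSketch {
    private final ChunkDoneSignal signal;

    WatchdogSketch(ChunkDoneSignal signal) {
        this.signal = signal;
    }

    /** Idles until the chunk finishes or a stop arrives. */
    String process() {
        while (!signal.await(1_000)) {
            if (Thread.currentThread().isInterrupted()) {
                return "INTERRUPTED"; // arbitrary exit status
            }
        }
        return "RELEASED"; // arbitrary exit status
    }

    /** This is where the stuck read would be cancelled; then let process() return. */
    void stop() {
        signal.release();
    }
}
```

Having `stop()` release the same latch means the batchlet's flow ends on a stop as well as on normal completion of the chunk, so the split is not left waiting on the batchlet.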

Is this a recommended pattern or some other official thing?  Nope.  Just me pondering how you might shake loose a stuck chunk step.  Other thoughts or ideas?