WebSphere Application Server & Liberty

JSR-352 (Java Batch) Post #30: Go (or Checkpoint) Your Own Way

By David Follis posted Wed February 20, 2019 08:03 AM

This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

Apologies to Fleetwood Mac….

Last time we talked about configuring an item-based or a time-based checkpoint interval.  If neither of those works for you, then perhaps writing your own checkpoint algorithm is the answer.  Just set checkpoint-policy="custom" on the chunk, specify a checkpoint-algorithm element that points to your Java class that implements the CheckpointAlgorithm interface, and you’re off.

The CheckpointAlgorithm allows you to get control at some interesting points in chunk processing and gives you a couple of options for controlling the checkpoint interval.

First of all, there are the beginCheckpoint and endCheckpoint methods, which get control around the processing for an individual chunk.  They give you a chance to get control before and after the checkpoint.  The endCheckpoint method gets control after the commit happens, so you are outside of the transaction.  Likewise, beginCheckpoint gets control before the next transaction starts, so you are outside of the chunk transaction there too.

To decide when to checkpoint, the interface includes a method called isReadyToCheckpoint. This gets control after each pass through read/process of an item and returns a boolean. Basically, this gives you a chance, after each item, to decide whether you want to checkpoint now or not.  Since this gets control on every pass through the loop, don’t do anything slow or expensive because the cost will add up. And remember you are inside the transaction, so if you do touch a transactional resource in this method, it is included in the transaction.
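
To make that concrete, here is a minimal sketch of a count-based isReadyToCheckpoint. The class name ItemCountCheckpoint and the itemsPerChunk parameter are made up for illustration, and the small CheckpointAlgorithm interface at the top is only a local stand-in for javax.batch.api.chunk.CheckpointAlgorithm so the example compiles without the batch API on the classpath. Notice that the per-item work is one increment and one compare, nothing more.

```java
// Local stand-in for javax.batch.api.chunk.CheckpointAlgorithm (same four
// methods). In a real job, delete this and implement the real interface.
interface CheckpointAlgorithm {
    int checkpointTimeout() throws Exception;
    void beginCheckpoint() throws Exception;
    boolean isReadyToCheckpoint() throws Exception;
    void endCheckpoint() throws Exception;
}

// Hypothetical example class: checkpoints after a fixed number of items.
class ItemCountCheckpoint implements CheckpointAlgorithm {
    private final int itemsPerChunk;
    private int itemsInChunk = 0;

    ItemCountCheckpoint(int itemsPerChunk) {
        if (itemsPerChunk < 1) {
            throw new IllegalArgumentException("itemsPerChunk must be at least 1");
        }
        this.itemsPerChunk = itemsPerChunk;
    }

    @Override
    public int checkpointTimeout() {
        // Assumption: 0 means "no limit set by this algorithm".
        return 0;
    }

    @Override
    public void beginCheckpoint() {
        // A new chunk is about to start, so start counting from zero.
        itemsInChunk = 0;
    }

    @Override
    public boolean isReadyToCheckpoint() {
        // Called once per item: keep it cheap.
        itemsInChunk++;
        return itemsInChunk >= itemsPerChunk;
    }

    @Override
    public void endCheckpoint() {
        // Nothing to do after the commit in this sketch.
    }
}
```

The constructor argument is just for the sketch. In a real job the value would more likely come from a job property.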

Finally, there is the checkpointTimeout method. This gets control before every chunk starts and allows you to set a time interval, in seconds, for this chunk. This is just like setting the time-limit in the JSL, except that you can choose a different value for each chunk.
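
As one possible use of a per-chunk value, the sketch below allows more time for the very first chunk (when caches and connections may still be cold) and less for every chunk after that. The class name WarmUpTimeoutCheckpoint and all three constructor parameters are hypothetical, and the CheckpointAlgorithm interface shown is a local stand-in for javax.batch.api.chunk.CheckpointAlgorithm so the example is self-contained.

```java
// Local stand-in for javax.batch.api.chunk.CheckpointAlgorithm (same four
// methods). In a real job, delete this and implement the real interface.
interface CheckpointAlgorithm {
    int checkpointTimeout() throws Exception;
    void beginCheckpoint() throws Exception;
    boolean isReadyToCheckpoint() throws Exception;
    void endCheckpoint() throws Exception;
}

// Hypothetical example class: a longer allowance for the first chunk only.
class WarmUpTimeoutCheckpoint implements CheckpointAlgorithm {
    private final int firstChunkSeconds;
    private final int laterChunkSeconds;
    private final int itemsPerChunk;
    private int chunksCompleted = 0;
    private int itemsInChunk = 0;

    WarmUpTimeoutCheckpoint(int firstChunkSeconds, int laterChunkSeconds, int itemsPerChunk) {
        if (firstChunkSeconds < 0 || laterChunkSeconds < 0 || itemsPerChunk < 1) {
            throw new IllegalArgumentException("invalid timeout or chunk size");
        }
        this.firstChunkSeconds = firstChunkSeconds;
        this.laterChunkSeconds = laterChunkSeconds;
        this.itemsPerChunk = itemsPerChunk;
    }

    @Override
    public int checkpointTimeout() {
        // No side effects here: the answer depends only on how many chunks
        // have already finished in this execution.
        return chunksCompleted == 0 ? firstChunkSeconds : laterChunkSeconds;
    }

    @Override
    public void beginCheckpoint() {
        itemsInChunk = 0;
    }

    @Override
    public boolean isReadyToCheckpoint() {
        itemsInChunk++;
        return itemsInChunk >= itemsPerChunk;
    }

    @Override
    public void endCheckpoint() {
        // Runs after the commit, so this chunk is now done.
        chunksCompleted++;
    }
}
```

Since the counter lives in the object, a restarted job that gets a fresh instance would give its first chunk the longer value again, which may or may not be what you want.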

So how would you use this?  The most likely approach is to count the items (when you get control in isReadyToCheckpoint) and watch the time between passes through read/process.  Then, based on how things are going, adjust the item count or the elapsed time at which you checkpoint to achieve whatever your goal is.
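
Here is a rough sketch of one way that could look. It is a simplified variant: instead of timing each pass, it times the whole chunk between beginCheckpoint and endCheckpoint, then halves or doubles the item count to steer toward a target chunk duration. The class name AdaptiveCheckpoint, its parameters, the halve/double rule, and the injected clock are all assumptions made for illustration, and the CheckpointAlgorithm interface is a local stand-in for javax.batch.api.chunk.CheckpointAlgorithm so the example compiles on its own.

```java
// Local stand-in for javax.batch.api.chunk.CheckpointAlgorithm (same four
// methods). In a real job, delete this and implement the real interface.
interface CheckpointAlgorithm {
    int checkpointTimeout() throws Exception;
    void beginCheckpoint() throws Exception;
    boolean isReadyToCheckpoint() throws Exception;
    void endCheckpoint() throws Exception;
}

// Hypothetical example class: adjusts the items-per-chunk limit after each
// chunk based on how long that chunk took.
class AdaptiveCheckpoint implements CheckpointAlgorithm {
    private final long targetChunkNanos;
    private final int maxItems;
    private final java.util.function.LongSupplier nanoClock;
    private int itemLimit;
    private int itemsInChunk = 0;
    private long chunkStartNanos = 0L;

    // Pass System::nanoTime as the clock outside of tests.
    AdaptiveCheckpoint(int initialItems, int maxItems, long targetChunkMillis,
                       java.util.function.LongSupplier nanoClock) {
        if (maxItems < 1 || initialItems < 1 || initialItems > maxItems || targetChunkMillis < 1) {
            throw new IllegalArgumentException("invalid limits or target");
        }
        this.itemLimit = initialItems;
        this.maxItems = maxItems;
        this.targetChunkNanos = targetChunkMillis * 1_000_000L;
        this.nanoClock = nanoClock;
    }

    @Override
    public int checkpointTimeout() {
        // Assumption: 0 means "no limit set by this algorithm".
        return 0;
    }

    @Override
    public void beginCheckpoint() {
        itemsInChunk = 0;
        chunkStartNanos = nanoClock.getAsLong();
    }

    @Override
    public boolean isReadyToCheckpoint() {
        // Per-item path: just count and compare.
        itemsInChunk++;
        return itemsInChunk >= itemLimit;
    }

    @Override
    public void endCheckpoint() {
        // After the commit: the elapsed time includes the commit itself.
        long elapsed = nanoClock.getAsLong() - chunkStartNanos;
        if (elapsed > targetChunkNanos) {
            itemLimit = Math.max(1, itemLimit / 2);               // too slow: smaller chunks
        } else if (elapsed < targetChunkNanos / 2) {
            itemLimit = (int) Math.min((long) maxItems, 2L * itemLimit); // plenty of room: bigger chunks
        }
    }

    int currentItemLimit() {
        return itemLimit;
    }
}
```

The adjustment happens in endCheckpoint, outside the transaction, so the per-item path in isReadyToCheckpoint stays cheap.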

The tricky part isn’t implementing an algorithm to achieve your ideal checkpoint interval.  The tricky part is figuring out what your ideal checkpoint interval should be….