
JSR-352 (Java Batch) Post #105: Cleaning Up After a Step

By David Follis posted Wed August 26, 2020 08:39 AM

This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

To start at the beginning, follow the link to the first post.

The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, and Stitcher, or use the link to the RSS feed.

Last time we talked about doing initialization processing for a step.  This week we’ll clean up after ourselves.  We’ll take the easy cases first. 

As with initialization, cleanup for a batchlet is simple: initialize when you get control and clean up before you return. For a chunk step (or even a batchlet) you can also use a Step Listener to initialize and clean up in its beforeStep and afterStep methods.
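A minimal sketch of the batchlet case, assuming the "resource" is just an event log standing in for a file or connection. The `Batchlet` interface here is a local stand-in with the same two methods as `javax.batch.api.Batchlet`, declared only so the sketch compiles without the batch API jar; `CleanupBatchlet` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in mirroring javax.batch.api.Batchlet (assumption: used only so
// this sketch compiles on its own; a real batchlet implements the real interface).
interface Batchlet {
    String process() throws Exception;
    void stop() throws Exception;
}

// Hypothetical batchlet that records what it does in an event log.
class CleanupBatchlet implements Batchlet {
    final List<String> events = new ArrayList<>();

    @Override
    public String process() {
        events.add("init");            // initialize when we get control
        try {
            events.add("work");        // the real work of the step goes here
            return "COMPLETED";        // the returned string is the exit status
        } finally {
            events.add("cleanup");     // clean up before returning, even on an exception
        }
    }

    @Override
    public void stop() {
        // nothing to interrupt in this sketch
    }
}
```

The try/finally is a design choice rather than anything the specification requires: it just guarantees the cleanup runs whether the work completes or throws.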

As with initialization, where this gets tricky is in a partitioned chunk step. There is no hook that gets control on the partition thread once that partition is complete. The chunk listener has before and after methods, but the after method gets control after every chunk and there’s no way to know which one is the final chunk. Or is there?

How does a chunk step know that it is done? The partition plan provides parameters to each partition that are usually used to let each partition know what it is supposed to do. The common example is the start and end of a range of data this partition is to process (record numbers, primary-key values, and so on). The reader for each partition will have those partition values injected and will use them to find a starting point and read records until reaching whatever end value it was assigned.
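As a rough sketch of such a reader, assume the range is a pair of record numbers and the "records" are just generated strings. In a real application the start and end values would arrive as injected batch properties (`@Inject @BatchProperty`) rather than constructor arguments. The `ItemReader` interface below is a local stand-in with the same four methods as `javax.batch.api.chunk.ItemReader`, and `RangeReader` is a hypothetical name.

```java
import java.io.Serializable;

// Local stand-in mirroring javax.batch.api.chunk.ItemReader (assumption: declared
// here only so the sketch compiles without the batch API jar).
interface ItemReader {
    void open(Serializable checkpoint) throws Exception;
    void close() throws Exception;
    Object readItem() throws Exception;
    Serializable checkpointInfo() throws Exception;
}

// Hypothetical reader that works through an inclusive range of record numbers.
class RangeReader implements ItemReader {
    private final int start;   // first record number for this partition
    private final int end;     // last record number for this partition (inclusive)
    private int next;          // next record number to hand out

    RangeReader(int start, int end) {
        this.start = start;
        this.end = end;
    }

    @Override
    public void open(Serializable checkpoint) {
        // On a restart, resume from the checkpoint; otherwise begin at the range start.
        next = (checkpoint != null) ? (Integer) checkpoint : start;
    }

    @Override
    public void close() { }

    @Override
    public Object readItem() {
        if (next > end) {
            return null;               // past the assigned range: nothing left to read
        }
        return "record-" + next++;     // stand-in for fetching the real record
    }

    @Override
    public Serializable checkpointInfo() {
        return Integer.valueOf(next);  // where to resume after a restart
    }
}
```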

That might make you think about having a chunk listener that also gets those values injected and somehow knows which record the reader is on, so it can tell where we are in the processing for this partition. But it is actually easier than that.

When the reader reaches the end of its assigned range of things for this partition to do, it has to indicate to the batch container that it is done.  That’s handled by having the reader return a null instead of an object it read.  The batch container sees that null return value, skips the call to the processor, and completes the final chunk for this partition.  Could we somehow signal between the reader knowing it has run out of data and the after chunk listener?

Well, of course we could have some shared object that could do that. But a simpler approach is to have a single class that implements both the Read Listener and the Chunk Listener. The Read Listener’s afterRead method gets control after each read completes and can examine the returned object. If it is null, the partition is done, and the listener can set a flag in an instance field. When the afterChunk method in the same class gets control, it can examine that flag, know that this is the last time through the loop, and do whatever cleanup is required.
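Here is a minimal sketch of that combined listener. It assumes, as described above, that the same listener instance receives both the read and the chunk callbacks for a partition. The two interfaces are local stand-ins with the same method signatures as `javax.batch.api.chunk.listener.ItemReadListener` and `ChunkListener`, declared only so the sketch compiles without the batch API jar. `PartitionCleanupListener` is a hypothetical name, and the "cleanup" is just an entry in an event log.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-ins mirroring the javax.batch.api.chunk.listener interfaces
// (assumption: a real listener would implement the real interfaces instead).
interface ItemReadListener {
    void beforeRead() throws Exception;
    void afterRead(Object item) throws Exception;
    void onReadError(Exception ex) throws Exception;
}

interface ChunkListener {
    void beforeChunk() throws Exception;
    void onError(Exception ex) throws Exception;
    void afterChunk() throws Exception;
}

// Hypothetical listener: one class playing both roles, sharing an instance field.
class PartitionCleanupListener implements ItemReadListener, ChunkListener {
    private boolean readerDone = false;            // set once the reader returns null
    final List<String> events = new ArrayList<>(); // stand-in for real cleanup work

    @Override public void beforeRead() { }

    @Override
    public void afterRead(Object item) {
        if (item == null) {
            readerDone = true;   // the reader has run out of data for this partition
        }
    }

    @Override public void onReadError(Exception ex) { }

    @Override public void beforeChunk() { }

    @Override public void onError(Exception ex) { }

    @Override
    public void afterChunk() {
        if (readerDone) {
            events.add("cleanup");   // final chunk: release partition resources here
        }
    }
}
```

Driving it by hand shows the idea: after a chunk whose reads all returned objects, `afterChunk` does nothing, and after a chunk in which `afterRead(null)` was seen, `afterChunk` performs the cleanup.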