WebSphere Application Server & Liberty

JSR-352 (Java Batch) Post #104: Initialization Processing for Batch Artifacts

By David Follis posted Wed August 19, 2020 08:01 AM

This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

To start at the beginning, follow the link to the first post.

The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, Stitcher, or use the link to the RSS feed.

The ability to inject properties from the JSL into batch artifacts makes it possible to avoid hard-coding values and can let you dynamically change how a job behaves without changing the code.  A problem I run into occasionally is trying to inject a value that I want to use as part of initialization for the artifact. 

The problem is that property injection doesn’t happen until after the constructor for the artifact runs, so any injected field is still null inside the constructor. For the reader and writer that’s OK. The open method is as good a place to do initialization as the constructor. You just have to be careful not to repeat it if you are handling any kind of retry/rollback processing, where open gets called again.
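Here is a minimal sketch of that ordering, written as plain Java so it runs without a batch runtime. The class SketchReader, its fields, and the "orders.csv" value are all made up for illustration. In a real artifact the class would implement javax.batch.api.chunk.ItemReader and the inputName field would be annotated with @Inject @BatchProperty, and the runtime (not a main method) would set the field and call open.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a chunk reader (not a real JSR-352 artifact).
class SketchReader {
    String inputName;                 // stands in for an injected batch property
    final String seenInConstructor;   // records what the constructor could see
    int initCount = 0;                // how many times one-time setup actually ran
    private List<String> lookup;      // hypothetical expensive one-time resource

    SketchReader() {
        // Injection has not happened yet, so this captures null.
        seenInConstructor = inputName;
    }

    // Called at step start, and again when the reader is reopened after a rollback.
    void open(Object checkpoint) {
        if (lookup == null) {         // guard so one-time setup is not repeated
            lookup = new ArrayList<>();
            lookup.add("configured from " + inputName);
            initCount++;
        }
        // Per-open work, such as repositioning to the checkpoint, would go here.
    }

    String describe() {
        return lookup.get(0);
    }
}

class SketchReaderDemo {
    public static void main(String[] args) {
        SketchReader reader = new SketchReader();
        reader.inputName = "orders.csv";   // simulates the runtime injecting the property
        reader.open(null);                 // first open at step start
        reader.open("checkpoint-1");       // reopen after a simulated rollback
        System.out.println("constructor saw: " + reader.seenInConstructor);
        System.out.println("one-time setup ran " + reader.initCount + " time(s)");
        System.out.println(reader.describe());
    }
}
```

The null check on lookup is what keeps the second open from redoing the one-time work, while anything that really does need to happen on every open can sit outside the guard.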

But what about other artifacts like a batchlet, or the processor, or any of the listeners?  For a batchlet it doesn’t really matter.  Just do everything at once when it gets called.  But the processor and most of the listeners can get called a lot.  Let’s look at the processor.

Besides the constructor there is just one method that gets control over and over. You could, of course, just have a boolean called initialized that starts as false and lets you do your initialization the first time through the processItem method. That works, except that you’re going to check that boolean every time through the read/process loop even though you only do the initialization once. That’s potentially millions of pointless checks, and those little bits of CPU usage add up.
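A rough sketch of the flag approach looks like this. Again it is plain Java rather than a real artifact: SketchProcessor, prefix, and cachedPrefix are invented names, and in a real job the class would implement javax.batch.api.chunk.ItemProcessor with prefix injected as a batch property.

```java
// Hypothetical stand-in for a chunk processor (not a real JSR-352 artifact).
class SketchProcessor {
    String prefix;                        // stands in for an injected batch property
    private boolean initialized = false;  // checked on every single item
    private String cachedPrefix;          // result of the one-time initialization
    int initCount = 0;                    // how many times initialization actually ran

    Object processItem(Object item) {
        if (!initialized) {               // true only on the first call
            cachedPrefix = prefix.toUpperCase() + ":";
            initCount++;
            initialized = true;
        }
        return cachedPrefix + item;
    }
}

class SketchProcessorDemo {
    public static void main(String[] args) {
        SketchProcessor processor = new SketchProcessor();
        processor.prefix = "ab";          // simulates the runtime injecting the property
        for (int i = 0; i < 3; i++) {
            System.out.println(processor.processItem("item" + i));
        }
        System.out.println("initialization ran " + processor.initCount + " time(s)");
    }
}
```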

Another approach might be to do the initialization in the beforeStep method of a Step listener. You could hang the results in a ThreadLocal and access it from the processor with confidence. Unless your step is partitioned, in which case each partition runs on a separate thread and thus has a separate ThreadLocal value, and whatever beforeStep stored is not visible there.
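The thread problem is easy to see with nothing but java.lang.ThreadLocal. In this sketch the names StepConfigHolder and ThreadLocalSketch are hypothetical, the calling thread plays the part of the thread that ran beforeStep, and the extra thread plays the part of a partition.

```java
// Hypothetical holder for whatever a step listener's beforeStep would stash away.
class StepConfigHolder {
    static final ThreadLocal<String> CONFIG = new ThreadLocal<>();
}

class ThreadLocalSketch {
    // Reports what a brand-new thread (standing in for a partition thread) sees.
    static String readFromOtherThread() {
        final String[] seen = new String[1];
        Thread partition = new Thread(() -> seen[0] = StepConfigHolder.CONFIG.get());
        partition.start();
        try {
            partition.join();             // wait so the result is safely visible
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        // Simulates beforeStep running on the step's own thread.
        StepConfigHolder.CONFIG.set("initialized in beforeStep");
        System.out.println("same thread sees:  " + StepConfigHolder.CONFIG.get());
        System.out.println("other thread sees: " + readFromOtherThread());
    }
}
```

The other thread gets null because it never called set and the ThreadLocal has no initial value, which is exactly what a processor running on a partition thread would find.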

There’s no batch artifact that gets control just once at the start on each partition thread. With a partitioned step you’re pretty much stuck checking some boolean every time through the loop.

You might be tempted to use class statics to anchor things, but remember that more than one copy of a job from the application could be running concurrently in the server, and they would all share the same statics. And it wouldn’t help anyway if your environment is configured to run partitions in separate servers (using the Liberty partition dispatcher/executor model), since statics aren’t shared across servers.
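To make the concurrency hazard concrete, here is a small sketch with two processor instances standing in for two copies of the job running in the same server. StaticAnchoredProcessor and its init method are hypothetical, and the two "jobs" simply run one after the other here, which is enough to show the shared slot.

```java
// Hypothetical processor that anchors its configuration in a class static.
class StaticAnchoredProcessor {
    static String sharedConfig;           // one slot for the whole class, not per job

    void init(String injectedValue) {     // stands in for one-time initialization
        sharedConfig = injectedValue;
    }

    String processItem(String item) {
        return sharedConfig + ":" + item;
    }
}

class StaticAnchorDemo {
    public static void main(String[] args) {
        StaticAnchoredProcessor jobA = new StaticAnchoredProcessor();
        StaticAnchoredProcessor jobB = new StaticAnchoredProcessor();
        jobA.init("configA");             // first copy of the job initializes
        jobB.init("configB");             // second copy overwrites the same static
        // jobA now processes items with jobB's configuration.
        System.out.println(jobA.processItem("x"));
    }
}
```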

Now that we have things initialized, what about cleanup?  That’s for next time.