Jakarta Batch Post 125: Add Methods to ItemProcessor

By David Follis posted Wed February 17, 2021 07:59 AM

  
This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

To start at the beginning, follow the link to the first post.

The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, Stitcher, or use the link to the RSS feed.
-----

The issue can be found here.

The reader and writer have open and close methods as well as a method to gather checkpoint data before committing the chunk transaction.  By contrast, the ItemProcessor just has one method, processItem.  That method only gets control when the reader has returned data to process.  This issue proposes some new methods for the interface that would let the processor get control a bit more often.

The first proposal is the addition of a close method to allow the processor to flush whatever state it might have.  The original design of the processor didn’t involve it having any state.  It just gets control to handle the data provided by the reader.  But in practice it is clearly possible for the processor to accumulate information such as a count of certain types of items it has processed.  The processor might not even exist to give data to the writer, but just to count the items read that match some criteria.  As currently defined, there’s no easy way for the processor to do anything with the final count, because it won’t get control when the reader is done reading: the final read returns null, which means the processor won’t get called.
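The counting case can be sketched in a few lines of Java. This is only an illustration: the ItemProcessor interface here is a local stand-in that mirrors the shape of the real one so the example compiles on its own, and ErrorLineCounter, getMatches, and the sample input are hypothetical names made up for the sketch.

```java
// Local stand-in that mirrors the single-method shape of the batch ItemProcessor
// interface, so this sketch compiles without the batch API on the classpath.
interface ItemProcessor {
    Object processItem(Object item) throws Exception;
}

// Hypothetical processor that exists only to count matching items.
class ErrorLineCounter implements ItemProcessor {
    private long matches = 0;

    @Override
    public Object processItem(Object item) {
        if (item instanceof String && ((String) item).contains("ERROR")) {
            matches++;
        }
        // Returning null filters the item, so nothing is handed to the writer.
        return null;
    }

    long getMatches() {
        return matches;
    }
}

class CountingProcessorDemo {
    public static void main(String[] args) throws Exception {
        ErrorLineCounter processor = new ErrorLineCounter();
        String[] input = {"INFO start", "ERROR disk full", "ERROR timeout"};

        // Simulates the chunk loop: the processor runs once per item read.
        for (String line : input) {
            processor.processItem(line);
        }

        // Once the reader returns null the loop just ends. In a real step nothing
        // calls the processor again, so it has no callback from which to report
        // the total; this demo can only print it because it owns the object.
        System.out.println("matches = " + processor.getMatches());
    }
}
```

A close method on the processor would give the runtime a defined point at which to let the processor report or flush that final count.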

The second proposal is to create an opportunity for the processor to get control at each checkpoint.  It isn’t clear from the text of the issue whether this would be an opportunity for the processor to provide checkpoint data or just a chance to get control at the end of each chunk.  It seems to me that if we think the processor might be keeping a running count of things, it would want to checkpoint that information along with checkpoint information from the reader and writer.  Of course, a count is just a simple example.  There are certainly more complex things a processor could do that would result in some state that would need to be persisted and aligned with the persisted checkpoint data from the reader and writer.
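If the checkpoint hook were to return data, one possible shape would be to borrow the checkpointInfo signature that the reader and writer already have. The sketch below assumes exactly that. The CheckpointingItemProcessor interface is hypothetical and is not part of the current specification, and the loop in CheckpointDemo is only a stand-in for what a batch runtime would do at each chunk boundary.

```java
import java.io.Serializable;

// Hypothetical interface: checkpointInfo() does NOT exist on the real
// ItemProcessor today. Its signature is modeled on ItemReader.checkpointInfo().
interface CheckpointingItemProcessor {
    Object processItem(Object item) throws Exception;

    Serializable checkpointInfo() throws Exception;
}

// Hypothetical processor that keeps a running count as its only state.
class RunningCountProcessor implements CheckpointingItemProcessor {
    private long count = 0;

    @Override
    public Object processItem(Object item) {
        count++;
        return item; // pass the item through to the writer unchanged
    }

    @Override
    public Serializable checkpointInfo() {
        // Long is Serializable, so the count could be persisted next to the
        // reader's and writer's checkpoint data.
        return Long.valueOf(count);
    }
}

class CheckpointDemo {
    public static void main(String[] args) throws Exception {
        RunningCountProcessor processor = new RunningCountProcessor();
        String[] input = {"a", "b", "c", "d", "e", "f", "g"};
        int chunkSize = 3; // stands in for the chunk's item-count
        int inChunk = 0;

        for (String item : input) {
            processor.processItem(item);
            if (++inChunk == chunkSize) {
                // Stand-in for the runtime gathering checkpoint data before
                // committing the chunk transaction.
                System.out.println("checkpoint data: " + processor.checkpointInfo());
                inChunk = 0;
            }
        }
    }
}
```

Returning the data, rather than just getting control, is what would let the count stay aligned with the reader's position after a restart.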

The issue doesn’t mention adding an open method to the processor, but it seems clear to me that if we’re going to add the other two, we should add this one also.  It certainly seems reasonable that a processor might need to establish a connection (maybe to a rules engine or a database) to enable it to do whatever processing it is doing.  There are other ways to do this, of course.  We’ve talked about some of them in earlier posts.  But adding a method to match the ones in the reader and writer would make it much easier.
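To make the connection case concrete, here is a rough sketch of a processor with open and close methods modeled on the ones the reader and writer have. Everything in it is an assumption for illustration: LifecycleItemProcessor is a hypothetical interface, not anything in the specification, and RulesEngineClient is a made-up stand-in for a real rules engine or database connection.

```java
import java.io.Serializable;

// Hypothetical interface: open and close are NOT on the real ItemProcessor.
// The signatures are modeled on ItemReader.open(Serializable) and close().
interface LifecycleItemProcessor {
    void open(Serializable checkpoint) throws Exception;

    Object processItem(Object item) throws Exception;

    void close() throws Exception;
}

// Made-up stand-in for a connection to a rules engine or a database.
class RulesEngineClient implements AutoCloseable {
    private boolean connected = true;

    boolean accepts(Object item) {
        if (!connected) {
            throw new IllegalStateException("client is closed");
        }
        // Toy rule: accept any item whose text is not empty.
        return item != null && !item.toString().isEmpty();
    }

    @Override
    public void close() {
        connected = false;
    }
}

class RulesProcessor implements LifecycleItemProcessor {
    private RulesEngineClient client;

    @Override
    public void open(Serializable checkpoint) {
        // Establish the connection once, before any items are processed.
        client = new RulesEngineClient();
    }

    @Override
    public Object processItem(Object item) {
        // Returning null filters out items the rules reject.
        return client.accepts(item) ? item : null;
    }

    @Override
    public void close() {
        if (client != null) {
            client.close();
            client = null;
        }
    }
}

class LifecycleDemo {
    public static void main(String[] args) throws Exception {
        RulesProcessor processor = new RulesProcessor();
        processor.open(null); // null checkpoint: first execution, nothing to restore
        try {
            for (String item : new String[] {"order-1", "", "order-2"}) {
                System.out.println("'" + item + "' -> " + processor.processItem(item));
            }
        } finally {
            processor.close(); // release the connection when the step is done
        }
    }
}
```

Taking the checkpoint as a parameter to open, as the reader and writer do, would also give a restarted processor a place to restore any state it had saved.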

Would the open and close methods for the processor be included in the transaction that is wrapped around those methods for the reader and writer?  Probably, but there’s an open issue about that also.  We’ll consider that one next week.

Meanwhile, feel free to join in the discussion on this issue at the link above.
