WebSphere Application Server

JSR-352 (Java Batch) Post #32: Five Little Monkeys, Jumping on the Bed…. Skipping in Java Batch

By David Follis posted Wed March 06, 2019 09:47 AM

  
This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

To start at the beginning, follow the link to the first post.

The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, and Stitcher, or use the link to the RSS feed.

------

As a chunk step works through wherever the ItemReader is reading from, it might run into the occasional record that has something wrong with it.  Maybe account numbers are supposed to be numbers and it finds one with a letter.  Maybe the account holder’s last name field is blank (and it isn’t Madonna’s or Sting’s account).  Or some other thing is wrong with the record.  Should you fail the job right there?  Probably not.  One (or two, or three) bad records isn’t enough to stop the whole job which is possibly processing millions of records that are just fine.  You just want to skip this one and move along.

The first part of skipping records is coding the ItemReader to throw a known exception when a record is bad in a way that makes you want to skip it.  Your application can define classes that extend the generic Exception class and call them whatever you like.  Maybe NonNumericAccountNumber or MissingFamilyName or even just BadRecord with some specifics in the exception class about what went wrong.  You should include information in the exception about the problem record (which record it was or other identifying information).  We’ll see why in a moment.
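As a minimal sketch of such an exception, something like the following would do. The class names, fields, and message format here are illustrative assumptions, not anything the specification requires; the only point is that the exception carries enough detail to identify the bad record later.

```java
// Hypothetical base exception for a record the job should skip.
// The names and fields are illustrative, not part of JSR-352.
class BadRecordException extends Exception {
    private final long recordNumber;   // position of the record in the input
    private final String rawRecord;    // the offending record as it was read

    BadRecordException(long recordNumber, String rawRecord, String reason) {
        super("Record " + recordNumber + ": " + reason);
        this.recordNumber = recordNumber;
        this.rawRecord = rawRecord;
    }

    long getRecordNumber() { return recordNumber; }
    String getRawRecord() { return rawRecord; }
}

// A more specific subclass, in case different problems should be
// listed or handled separately.
class NonNumericAccountNumberException extends BadRecordException {
    NonNumericAccountNumberException(long recordNumber, String rawRecord) {
        super(recordNumber, rawRecord, "account number is not numeric");
    }
}
```

Using a common base class like this is one possible design choice: it lets a single entry in the JSL cover every kind of bad record, while still leaving room to name specific subclasses.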

Next you need to update the JSL for the step to define these exceptions as expected errors for which the record should be skipped.  Any time the ItemReader throws an exception, the batch container will compare it against the exception classes listed in the skippable-exception-classes element of the chunk in the JSL.
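A JSL fragment for this might look roughly like the following. The step id, the artifact ref names, the exception's package, and the item-count and skip-limit values are all made up for illustration. The skip-limit attribute is optional in the specification; it is shown only as one way to cap how many records may be skipped before the step fails.

```xml
<step id="processAccounts">
  <chunk item-count="100" skip-limit="10">
    <reader ref="accountReader"/>
    <processor ref="accountProcessor"/>
    <writer ref="accountWriter"/>
    <!-- Exceptions of these classes cause the record to be skipped -->
    <skippable-exception-classes>
      <include class="com.example.batch.BadRecordException"/>
    </skippable-exception-classes>
  </chunk>
</step>
```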

When a skippable exception is thrown by the ItemReader, the batch container doesn’t call the ItemProcessor (since there is no item to process) but instead just calls the ItemReader again to read the next record.  The ItemReader needs to keep track of which record it is on so that, having thrown a skippable exception for one record, it moves on to the next record when called again rather than re-reading the bad one.  We’ll compare this to retryable exceptions later on.
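The sketch below illustrates the reader side of this. To keep it runnable without the batch API on the classpath, AccountReader only mimics the shape of an ItemReader's readItem() method, and SkipLoopDemo is a simplified stand-in for the container's read loop, not the real container logic. All class names are assumptions. The detail to notice is that the reader advances its position before validating the record, so the next call picks up the following record.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical skippable exception that identifies the bad record.
class NonNumericAccountNumber extends Exception {
    private final int recordNumber;
    NonNumericAccountNumber(int recordNumber, String value) {
        super("Record " + recordNumber + ": account number '" + value + "' is not numeric");
        this.recordNumber = recordNumber;
    }
    int getRecordNumber() { return recordNumber; }
}

// Shaped like an ItemReader: readItem() returns the next item, or null at end of input.
class AccountReader {
    private final String[] records;
    private int next = 0;   // index of the next record to read

    AccountReader(String... records) { this.records = records; }

    Object readItem() throws Exception {
        if (next >= records.length) return null;
        String record = records[next];
        next++;   // advance BEFORE validating, so a skip moves on to the next record
        if (!record.matches("\\d+")) {
            throw new NonNumericAccountNumber(next, record);   // 1-based record number
        }
        return record;
    }
}

// Simplified stand-in for the container's read loop (illustration only).
class SkipLoopDemo {
    static List<Object> readAll(AccountReader reader, List<String> skipped) {
        List<Object> items = new ArrayList<>();
        while (true) {
            try {
                Object item = reader.readItem();
                if (item == null) break;          // end of input
                items.add(item);
            } catch (NonNumericAccountNumber e) { // plays the role of a skippable exception
                skipped.add(e.getMessage());      // then simply read again
            } catch (Exception e) {               // anything else fails the step
                throw new IllegalStateException("step failed", e);
            }
        }
        return items;
    }
}
```

Given the records "1001", "10A2", and "1003", the loop collects the first and third items and records a single skip for record 2.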

But before the reader gets called again, any SkipReadListener defined for the step will get called.  The SkipReadListener gets passed the exception that was thrown by the ItemReader.  If you remembered to include information about the bad record in the exception, the skip listener can log information about the bad record somewhere to get the record corrected.  Maybe it sends an email to the owner of the table data.
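Here is a rough sketch of what such a listener could look like. The SkipReadListener interface declared in the block is a local stand-in that mirrors the onSkipReadItem(Exception) method of javax.batch.api.chunk.listener.SkipReadListener, so the example compiles on its own; in a real application you would implement the spec's interface instead. The exception class and the report format are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-in for javax.batch.api.chunk.listener.SkipReadListener, declared here
// only so the sketch compiles without the batch API on the classpath.
interface SkipReadListener {
    void onSkipReadItem(Exception ex) throws Exception;
}

// Hypothetical skippable exception that remembers which record was bad.
class MissingFamilyName extends Exception {
    private final long recordNumber;
    MissingFamilyName(long recordNumber) {
        super("family name is blank");
        this.recordNumber = recordNumber;
    }
    long getRecordNumber() { return recordNumber; }
}

// Collects one report line per skipped record. A real listener might write to a log,
// insert into an error table, or send an email instead.
class ReportingSkipReadListener implements SkipReadListener {
    private final List<String> report = new ArrayList<>();

    @Override
    public void onSkipReadItem(Exception ex) {
        if (ex instanceof MissingFamilyName) {
            MissingFamilyName bad = (MissingFamilyName) ex;
            report.add("Skipped record " + bad.getRecordNumber() + ": " + bad.getMessage());
        } else {
            // No record details available, so report what we can.
            report.add("Skipped a record: " + ex);
        }
    }

    List<String> getReport() { return report; }
}
```

The listener only gets the exception, which is why it pays to put the record number (or other identifying information) into the exception when the reader throws it.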

There are also SkipProcessListener and SkipWriteListener interfaces you can implement in case you throw an exception from the processor or writer that is defined as skippable in the JSL.
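These two listeners get a little more to work with than the read listener. The sketch below uses local stand-in interfaces that mirror the method shapes of the spec's SkipProcessListener (which is passed the item being processed) and SkipWriteListener (which is passed the list of items being written). The SkipAuditListener class and its message format are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Local stand-ins mirroring the method shapes of the javax.batch.api.chunk.listener
// interfaces, declared here only so the sketch compiles on its own.
interface SkipProcessListener {
    void onSkipProcessItem(Object item, Exception ex) throws Exception;
}

interface SkipWriteListener {
    void onSkipWriteItem(List<Object> items, Exception ex) throws Exception;
}

// One hypothetical class that handles skips from both the processor and the writer.
class SkipAuditListener implements SkipProcessListener, SkipWriteListener {
    private final List<String> audit = new ArrayList<>();

    @Override
    public void onSkipProcessItem(Object item, Exception ex) {
        // The processor listener sees the single item that failed.
        audit.add("processor skipped " + item + ": " + ex.getMessage());
    }

    @Override
    public void onSkipWriteItem(List<Object> items, Exception ex) {
        // The writer listener sees the whole list of items that was being written.
        audit.add("writer skipped " + items.size() + " item(s): " + ex.getMessage());
    }

    List<String> getAudit() { return audit; }
}
```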
