This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application. To start at the beginning, follow the link to the first post.
The next post in the series is here. This series is also available as a podcast on iTunes, Google Play, Stitcher, or use the link to the RSS feed.
The Job Repository is an interesting aspect of the JSR-352 specification in that it is a critical piece of the ability to process jobs, but it is barely mentioned by the spec. The specification basically declares that you have to have one, but the details are outside the scope of the specification itself.
The purpose of the Job Repository is to remember things about the jobs you have run in the past and the current state of jobs running right now. The batch and exit status of every step of every job you have ever run is kept in the repository so you can go back and see what happened. The fact that some job execution failed has to be remembered so that you can restart it. The results of each step executed as part of that failed job have to be remembered so that restart processing can do its thing.
Jobs that are running right now keep their status up to date in the repository in case of a failure. This is especially important for checkpoint data that might be needed in the event of a failure and restart, or even a retry-rollback scenario within execution of the step. For partitioned steps, the state of each partition (and information about how many partitions there are) has to be remembered.
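To make that list of "things to remember" a bit more concrete, here is a minimal sketch of the state a repository has to hold for each step. Everything in it is hypothetical: `InMemoryJobRepository`, `StepRecord`, and `stepsToRerun` are names made up for illustration, and a plain in-memory map stands in for whatever transactional store a real implementation uses. Only the status values are taken from the specification's `BatchStatus` enum.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory stand-in for a job repository; not part of JSR-352. */
final class InMemoryJobRepository {

    /** Same values as javax.batch.runtime.BatchStatus, redeclared to stay self-contained. */
    public enum BatchStatus { STARTING, STARTED, STOPPING, STOPPED, FAILED, COMPLETED, ABANDONED }

    /** What has to be remembered about one step of one job execution. */
    public static final class StepRecord {
        public final String stepName;
        public BatchStatus batchStatus = BatchStatus.STARTING;
        public String exitStatus;        // free-form string chosen by the application
        public Serializable checkpoint;  // last committed checkpoint data, if any
        public int partitionCount;       // 0 for a step that is not partitioned

        StepRecord(String stepName) {
            this.stepName = stepName;
        }
    }

    // execution id -> (step name -> record), kept in the order steps were first seen
    private final Map<Long, Map<String, StepRecord>> executions = new LinkedHashMap<>();

    /** Returns the record for a step, creating it the first time the step is seen. */
    public StepRecord step(long executionId, String stepName) {
        return executions
                .computeIfAbsent(executionId, id -> new LinkedHashMap<>())
                .computeIfAbsent(stepName, StepRecord::new);
    }

    /**
     * Steps that did not complete in the given execution. This is a simplification:
     * real restart processing also honors attributes in the job definition.
     */
    public List<String> stepsToRerun(long executionId) {
        Map<String, StepRecord> steps =
                executions.getOrDefault(executionId, Collections.emptyMap());
        List<String> names = new ArrayList<>();
        for (StepRecord record : steps.values()) {
            if (record.batchStatus != BatchStatus.COMPLETED) {
                names.add(record.stepName);
            }
        }
        return names;
    }
}
```

If an execution finished its "load" step but failed in "transform", `stepsToRerun` returns just "transform", which is the sort of question restart processing has to be able to ask the repository.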
But the specification leaves it up to the individual implementations to decide how to keep track of all this, in a transactional way. This likely means some sort of traditional database will be underneath. But the spec nicely leaves the way open for future developments of technology that meet the requirements in some new way.
Unfortunately, as a side-effect of not specifying how the Job Repository is implemented, the spec is also silent on how you get rid of things in it. It is just great that the repository remembers all the details about that job you ran two weeks ago. But eventually, unless some government regulation requires you to keep this sort of thing forever, you will likely cease to care about details of job executions from years and years ago.
Eventually the space required to store all of this information might become a problem. And you will want to remove some things from the repository (if only to make the results returned by APIs looking for information about a particular job you run every day a bit smaller). And the specification doesn’t say how. So, as part of evaluating how a particular implementation of JSR-352 might work for you, consider how it manages the repository and how you get things removed from it.
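Since the specification defines no purge operation, any cleanup is implementation-specific. As a rough sketch of what an age-based retention policy might look like, the following removes finished executions older than a cutoff. `RepositoryRetention`, `ExecutionEntry`, and `purgeOlderThan` are hypothetical names, and the list stands in for the repository's real storage.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Hypothetical retention policy; JSR-352 itself defines no purge API. */
final class RepositoryRetention {

    /** Minimal view of a job execution as a repository might store it. */
    static final class ExecutionEntry {
        final long executionId;
        final String jobName;
        final Instant endTime;  // null while the execution is still running

        ExecutionEntry(long executionId, String jobName, Instant endTime) {
            this.executionId = executionId;
            this.jobName = jobName;
            this.endTime = endTime;
        }
    }

    /** Removes finished entries that ended before (now - maxAge); returns the ids removed. */
    static List<Long> purgeOlderThan(List<ExecutionEntry> entries, Duration maxAge, Instant now) {
        Instant cutoff = now.minus(maxAge);
        List<Long> removed = new ArrayList<>();
        for (Iterator<ExecutionEntry> it = entries.iterator(); it.hasNext(); ) {
            ExecutionEntry entry = it.next();
            // Running executions have no end time and are never purged.
            if (entry.endTime != null && entry.endTime.isBefore(cutoff)) {
                removed.add(entry.executionId);
                it.remove();
            }
        }
        return removed;
    }
}
```

A real policy would probably need more than age. For example, you might want to keep a failed execution around for as long as you still intend to restart it.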
Our song is obviously “Try to Remember,” which was originally sung by Jerry Orbach (yeah, the guy from Law and Order). You can find it online. Have a listen.