This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application. To start at the beginning, follow the link to the first post.
The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, Stitcher, or use the link to the RSS feed.
Last time we talked about the in-memory Job Repository implementation offered by the WebSphere Liberty implementation of Java Batch. In-memory is great and easy to set up (nothing to do at all). But for real jobs that you care about, you want a persistent store to keep that stuff.
The Liberty Batch implementation uses the Java Persistence API (JPA) to access the database of your choice. JPA is pretty cool because it means the batch support in Liberty doesn't need code that handles all the quirky differences between databases. That's all handled in the JPA layer.
It also means that WebSphere doesn’t have to supply DDL for you to use to set up the tables that will be used as the repository. Supplying different DDL for different database implementations is an endless source of problems. With JPA you can actually let the runtime create the tables itself on the fly. The first time the batch code tries to access a repository table, the JPA code will realize the tables aren’t there and go create them for you.
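In Liberty configuration, that on-the-fly table creation is controlled by an attribute on the databaseStore element. A minimal sketch (the id and dataSourceRef values here are placeholders; check the Liberty configuration reference for your release):

```xml
<!-- createTables="true" lets the JPA layer build the repository
     tables on first access if they don't already exist. -->
<databaseStore id="BatchDatabaseStore"
               dataSourceRef="batchDB"
               createTables="true"/>
```

Setting createTables="false" is the way to go if table creation is something only your DBA is allowed to do.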
If having tables automatically created makes your skin crawl (or gives your DBA the willies) then you can also use a utility to just generate the appropriate DDL for your database. Then your DBA can give it a long gander before running it during an appropriate change window.
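Liberty ships a ddlGen utility in the wlp/bin directory for exactly this. A hedged sketch of how it's invoked (verify the exact syntax and output location against the documentation for your Liberty release; "myServer" is a placeholder server name):

```shell
# Generate DDL for the databaseStore(s) configured in server "myServer".
# The generated .ddl files land under the server's output directory.
wlp/bin/ddlGen generate myServer
```

Your DBA can then inspect and tweak the generated DDL before running it against the database.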
Recall that when you configure the server and want the in-memory repository you don’t have to do anything. If you want a persistent database used as the repository you have to tell the server where it is. That means setting up the batch persistence configuration element in server.xml to point to a datastore configuration.
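A sketch of what that wiring might look like in server.xml. The element names follow the Liberty batch configuration, but the ids, JNDI name, host, port, and library path here are all placeholders for illustration (this example assumes DB2; substitute the properties element for your database):

```xml
<server>
    <featureManager>
        <feature>batch-1.0</feature>
    </featureManager>

    <!-- Point the batch persistence at a databaseStore -->
    <batchPersistence jobStoreRef="BatchDatabaseStore"/>

    <!-- The store wraps a dataSource -->
    <databaseStore id="BatchDatabaseStore"
                   dataSourceRef="batchDB"
                   createTables="true"/>

    <dataSource id="batchDB" jndiName="jdbc/batchDB">
        <jdbcDriver libraryRef="DB2Lib"/>
        <properties.db2.jcc databaseName="BATCHDB"
                            serverName="dbhost"
                            portNumber="50000"/>
    </dataSource>

    <library id="DB2Lib">
        <fileset dir="/path/to/db2/jars" includes="*.jar"/>
    </library>
</server>
```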
Or you can just set up the default datastore configuration and the batch code will find it automatically and use it.
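If I recall the defaults correctly, when no batchPersistence element is present the batch runtime falls back through a chain of defaults that ends at the Java EE default data source, so defining a dataSource with the well-known id DefaultDataSource is enough on its own. A sketch under that assumption (again, connection details are placeholders, and the fallback behavior is worth confirming in the Liberty docs):

```xml
<!-- No batchPersistence or databaseStore elements needed:
     the batch code discovers the default data source by its id. -->
<dataSource id="DefaultDataSource">
    <jdbcDriver libraryRef="DB2Lib"/>
    <properties.db2.jcc databaseName="BATCHDB"
                        serverName="dbhost"
                        portNumber="50000"/>
</dataSource>
```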
Using the in-memory repository will ultimately result in job information piling up in the heap. With a persistent repository job information piles up on disk somewhere. Eventually you will want to purge information about some of those jobs too. We’ll get to how purge operations work when we talk about the REST interface. That might be next…
For our song this week I think it should be “Memory” from CATS. When your job fails, you can go to the persistent repository and look at last week’s job that worked. “I remember the time I knew what happiness was, let the memory live again.”