JSR-352 (Java Batch) Post #166: Kubernetes Jobs – One Pod, One Record?

By David Follis posted Thu January 13, 2022 08:56 AM

This post is part of a series delving into the details of the JSR-352 (Java Batch) specification. Each post examines a very specific part of the specification and looks at how it works and how you might use it in a real batch application.

To start at the beginning, follow the link to the first post.

The next post in the series is here.

This series is also available as a podcast on iTunes, Google Play, and Stitcher, or use the link to the RSS feed.

Last week we talked about letting Kubernetes Jobs default to just one pod with one success.  This time we’ll look at making use of the spec.completions configuration to deliberately run the ‘job’ more than once.

Ok, so we aren’t really running the job more than once.  We’re running the command that is specified in the YAML more than once.  We use the completions configuration to tell Kubernetes how many successful completions we need in order for the job to finish.

Say what?  I have to run the job multiple times successfully for the job to succeed?  No, just the command.  The set of successful commands is the job.  Think of each execution of the command as one pass through a loop.  Think of this Kubernetes Job as a single chunk step job from Jakarta Batch.

The pod will get spun up and run the command you specify, which does one instance of the thing you want done.  Maybe it reads a single record from an input file and processes it.  If it works, the command completes successfully.  Then we go around again.  Until we’ve processed spec.completions number of records successfully.
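
To make that loop a bit more concrete, here is a rough sketch of what such a Job definition might look like. The Job name, container name, image, and command are all made up for illustration; the only part taken from the discussion above is spec.completions.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: process-one-record        # hypothetical name
spec:
  completions: 5                  # the Job is done after 5 successful pod completions
  template:
    spec:
      containers:
      - name: worker              # hypothetical container name
        image: example.com/record-processor:latest   # hypothetical image
        command: ["/opt/app/process-one-record.sh"]  # hypothetical; handles one record, exits 0 on success
      restartPolicy: Never        # Jobs only accept Never or OnFailure
```

With spec.parallelism left at its default of 1, the pods run one after another, which is what makes this feel like a loop.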

That’s a bit odd too.  You’d need to know how many records you had to process up front.  And if you specify that number as the number of successful completions then you must successfully process them all.  A failure won’t count so you’d have to try again with that record to finish. 

And it isn’t an “at least” count.  If you’ve got 1000 records and figure 950 successes is enough, it will get to 950 and stop whether there are still 50 records to go or not.  Of course, if you’re just working off some backlog that would be ok.  Every time this Job is run, we process N records.  As long as there are at least N records to process (probably more if you allow for failures) then it would probably be ok. 

I think this feels a little quirky because I’m trying to force this into what I think of as a ‘job’.  Maybe you wouldn’t use this to process records.  Maybe it just does something and after it has done that successfully 5 times (or whatever) that’s good enough.  Maybe your job has a list of 100 people it can notify about something, and it needs to successfully notify 10 of them.  Each execution of the command to do the notification gets a name from a list and gives it a try.  Once you’ve successfully notified 10 people you’re done.  You don’t care which ten you got.  Something like that perhaps.
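
The notification example might be sketched something like this. The names, image, and command are hypothetical, and how each pod picks a name from the list is left entirely to the application. The backoffLimit field is a standard part of the Job spec that limits how many retries are allowed before the whole Job is marked as failed (it defaults to 6). The value here is just an assumption to leave plenty of room for failed notifications.

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: notify-ten                # hypothetical name
spec:
  completions: 10                 # done once 10 notifications have succeeded
  backoffLimit: 90                # assumption: allow many failed attempts (default is 6)
  template:
    spec:
      containers:
      - name: notifier            # hypothetical container name
        image: example.com/notifier:latest        # hypothetical image
        command: ["/opt/app/notify-next.sh"]      # hypothetical; picks a name, exits 0 on success
      restartPolicy: Never        # a failed attempt becomes a failed pod, not an in-place restart
```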

Regardless, it is an interesting model.