IBM TechXchange Virtual WebSphere z/OS User Group

Liberty z/OS Post #42 - Displaying Information About Liberty z/OS Interrupted Requests

By David Follis posted Thu November 09, 2023 01:42 PM

  

This post is part of a series exploring the unique aspects and capabilities of WebSphere Liberty when running on z/OS.
We'll also explore considerations when moving from WebSphere traditional on z/OS to Liberty on z/OS.

The next post in the series is here.

To start at the beginning, follow this link to the first post.

---------------

The display interrupts command exists to give you some insight into what the request timing feature is doing.  If you’ve enabled that feature and configured some timeout values (or taken the defaults), then it is busy monitoring dispatched requests and, possibly, timing them out.

The display command will show you what is going on.  The feature allows you to configure timeouts for ‘slow’ and ‘hung’ requests.  So you need to decide (or determine experimentally) how long a normal request under normal conditions takes.  It is probably a range.  Then you can decide how much longer than that range you might consider slow.  If you’re of a statistical mind, you might observe a bunch of requests and declare ‘slow’ to be some number of standard deviations above the mean.  Or you could just eyeball it and decide what feels slow. 

Beyond ‘slow’ you can also determine at what point a request should be considered ‘hung’.  The idea to have in your head is that if a request takes this long it probably isn’t going to finish at all and has gotten stuck somehow.  Set it to a time beyond all reasonable expectation of completion (or some more standard deviations up from the mean, if you like). 
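If you do want to go the statistical route, here is a minimal Python sketch of the idea.  It takes a list of observed request durations and suggests ‘slow’ and ‘hung’ thresholds as the mean plus some number of standard deviations.  The function name, the sample durations, and the multipliers (3 and 10) are all made up for illustration; nothing here comes from Liberty itself, and you would still have to put the resulting values into your request timing configuration yourself.

```python
from statistics import mean, stdev


def suggest_thresholds(durations_s, slow_sigmas=3.0, hung_sigmas=10.0):
    """Suggest (slow, hung) thresholds in seconds from observed durations.

    The multipliers are illustrative assumptions, not Liberty defaults.
    """
    if len(durations_s) < 2:
        raise ValueError("need at least two observations")
    mu = mean(durations_s)
    sigma = stdev(durations_s)  # sample standard deviation
    slow = mu + slow_sigmas * sigma
    hung = mu + hung_sigmas * sigma
    return slow, hung


# Hypothetical response times, in seconds, for a handful of normal requests.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
slow, hung = suggest_thresholds(observed)
print(f"slow after {slow:.1f}s, hung after {hung:.1f}s")
```

With real data you would want far more than five observations, and if your response times have a long tail you may prefer a high percentile over mean plus standard deviations.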

The display command can then show you summary or detailed information about all the requests that are considered ‘hung’ (the display syntax calls these ‘timed out’, written as one word: “timedout”).

You can also get summary or detailed information about requests past a certain age.  No, this isn’t a dating app.  You aren’t looking for single requests over 25.  You’re specifying a number of seconds (that’s the age, in seconds) and you want information about requests that have been in the server for longer than that. 

So what good is all this?  I think it is interesting for automation, like the display work command.  You could have automation issue some variant of this command and, based on the output, make a decision about the state of the server.  A server with a single timed out request or one or two past some age might not be interesting.  Sometimes things are just slow.  But if suddenly the server has 40 requests that are taking too long, something has gone wrong (or is going wrong and maybe getting worse).  So perhaps it isn’t simply the number of requests that are slow, but the change since the last time you asked.  First there was one, now 10, next time you look, 45…that’s not good. 
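To make the ‘change since the last time you asked’ idea concrete, here is a rough Python sketch of the decision logic automation might apply.  It assumes you have already issued the display command a few times and pulled a count of slow or timed out requests out of each response; how you issue the command and parse its output is deliberately left out.  The function name, the labels it returns, and the limits (40 requests, a noise floor of 2, a window of 3 samples) are hypothetical.

```python
def assess_trend(counts, alarm_count=40, noise_floor=2, window=3):
    """Classify server state from successive counts (oldest first).

    Returns "alarm" if the latest count is at or above alarm_count,
    "worsening" if the last `window` counts rise strictly and the latest
    is above noise_floor, "unknown" with no samples, and "ok" otherwise.
    All limits are illustrative assumptions.
    """
    if not counts:
        return "unknown"
    latest = counts[-1]
    if latest >= alarm_count:
        return "alarm"
    recent = counts[-window:]
    rising = len(recent) == window and all(
        earlier < later for earlier, later in zip(recent, recent[1:])
    )
    if rising and latest > noise_floor:
        return "worsening"
    return "ok"


# First there was one, now 10, next time you look, 45.
print(assess_trend([1, 10, 45]))
# Still below the alarm limit, but climbing every time we look.
print(assess_trend([1, 10, 30]))
# One or two slow requests that come and go.
print(assess_trend([1, 2, 1]))
```

The point of checking the trend separately from the absolute count is that you can react while things are still getting worse, rather than waiting until the server is already buried.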

Beyond this sort of ‘find all the requests that are…whatever’ information, you can also get information about a specific request if you know the request id.  That id comes back as part of the information from the other flavors.  So, if you have a request that is slow, you can parse out the request id and ask about it specifically later on to see how it is doing.  The request id also appears in the SMF 120-11 record (if you have those enabled) so you can get more information about CPU used etc. later on.  That’s the SM120BBV field.
