From what I understand, ISVA uses permanent connections, which means that once it establishes a connection with a remote server, it keeps that connection open unless it is explicitly closed.
I'm not sure whether you can run netstat -an on ISVA, but if you can, it will show you all connections. Another option is to capture a tcpdump and check what is going on, although from what you describe you may have already tried that.
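If you do manage to get netstat -an output off the box, a small script can summarize it. The following is only a minimal sketch in Python: it assumes Linux-style netstat columns (proto, recv-q, send-q, local address, foreign address, state), and the sample data and the function name summarize_netstat are made up for illustration. A growing pile of CLOSE_WAIT or ESTABLISHED entries towards a single remote endpoint would be worth a closer look.

```python
from collections import Counter

def summarize_netstat(output: str):
    """Count TCP connections by state, and ESTABLISHED ones by remote endpoint.

    Assumes Linux-style `netstat -an` rows:
    proto recv-q send-q local-address foreign-address state
    """
    by_state = Counter()
    by_remote = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Skip headers, UDP rows and anything without a state column.
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        remote, state = fields[4], fields[5]
        by_state[state] += 1
        if state == "ESTABLISHED":
            by_remote[remote] += 1
    return by_state, by_remote

# Hypothetical sample output, for illustration only.
SAMPLE = """\
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp        0      0 0.0.0.0:443             0.0.0.0:*               LISTEN
tcp        0      0 10.0.0.5:51234          10.0.0.20:636           ESTABLISHED
tcp        0      0 10.0.0.5:51240          10.0.0.20:636           ESTABLISHED
tcp        0      0 10.0.0.5:40122          10.0.0.30:8443          CLOSE_WAIT
"""

if __name__ == "__main__":
    states, remotes = summarize_netstat(SAMPLE)
    for state, count in states.most_common():
        print(f"{state:12s} {count}")
    for remote, count in remotes.most_common():
        print(f"{remote:22s} {count}")
```

Running this against snapshots taken before and during the CPU spike would let you compare which remote endpoints hold the most connections in each case.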
I also understand that the CPU peak is a symptom of this behavior on ISVA. The ISVA configuration offers only limited options for setting a keepalive parameter to disable permanent connections. For example, you can set this parameter for LDAP connections, but I'm not sure whether you can do the same for the other types of external connections your ISVA may have.
Still, I would focus on this first, because the symptoms you describe suggest that permanent connections might be the cause.
Nevertheless, there are other reasons for that to happen. Here are a few:
- Connection Leaks
- Threads in deadlock situations
But I don't think we are able to look into these ourselves! I guess only IBM support would be able to check these possibilities.
------------------------------
Joao Goncalves
Pyxis, Lda.
------------------------------
Original Message:
Sent: Tue April 27, 2021 11:55 AM
From: Rajkumar Godi
Subject: webseald spiking CPU on appliance and stops processing requests
Thanks for your inputs, Jon. Yes, we do have an ongoing support case; however, I just wanted to have the benefit of this forum to seek some input, as this appears to be a unique issue of its kind, and I hate to say we have been debugging it for almost 10 months now :) We did collect tons of data but have no clues so far.
The fact that the spiking instance does not auto-recover makes it more complex to arrive at any obvious conclusions.
------------------------------
Rajkumar
Original Message:
Sent: Tue April 27, 2021 03:43 AM
From: Jon Harry
Subject: webseald spiking CPU on appliance and stops processing requests
Hi Raj,
If I had to make a wild guess, I would say the issue is most likely related to some connection that has gone stale over an extended period of inactivity. This might not be the connection to the backend servers; it could be a connection to a directory, a database, or another component in the architecture. Sometimes you can get locks when a firewall enforces a connection timeout that is different from what the connection endpoints expect: they think the connection is valid, but the firewall is dropping packets. Usually these situations are recovered automatically, but perhaps something else is going on in your specific case.
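To illustrate the general mechanism (this is not ISVA configuration), here is a minimal Python sketch of how a client can enable TCP keepalive on a socket so that an idle connection is probed before a firewall idle timeout silently expires it. The function name enable_keepalive and the timing values are assumptions chosen for illustration; the per-socket tuning options are platform-dependent, so the sketch only sets the ones that exist on the platform it runs on.

```python
import socket

def enable_keepalive(sock, idle_s=600, interval_s=60, probes=5):
    """Turn on TCP keepalive and, where the platform exposes them, tune the timers.

    idle_s:     seconds of inactivity before the first probe (illustrative value)
    interval_s: seconds between unanswered probes (illustrative value)
    probes:     unanswered probes before the connection is reported dead
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)
    # These constants are not available on every platform, so look them up
    # defensively instead of assuming they exist.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),
        ("TCP_KEEPINTVL", interval_s),
        ("TCP_KEEPCNT", probes),
    ):
        option = getattr(socket, name, None)
        if option is not None:
            sock.setsockopt(socket.IPPROTO_TCP, option, value)
    return sock

if __name__ == "__main__":
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(s)
    print("keepalive enabled:", s.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0)
    s.close()
```

The idea is that the idle time before the first probe should be shorter than the firewall's idle timeout, so that either the probes keep the firewall state alive or the endpoint learns that the connection is dead instead of blocking on it.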
I would imagine you'll need to open a support case to get to the bottom of this. If this issue happens each Monday, it should be possible to gather some debug or stats or something which can help point in the right direction.
Jon.
------------------------------
Jon Harry
Consulting IT Security Specialist
IBM
Original Message:
Sent: Tue April 27, 2021 12:39 AM
From: Rajkumar Godi
Subject: webseald spiking CPU on appliance and stops processing requests
Rather unusual but has a very clear pattern.
On ISAM 9.0.7.2, every Monday at the start of peak traffic, one or two of our WebSEALs (out of 10 replicated WebSEALs) gradually start hitting 90 to 99% CPU and never recover until the appliance is hard rebooted. Upon investigation we found that the webseald process was consuming the CPU; however, we have failed to identify the cause. We do have a bunch of slow backends with high response times and large request sizes, for which we defined per-junction thread limits. Also, any misconfiguration in the WebSEAL config would cause the issue to occur on all days of the week instead of just Monday morning.
Is it a capacity issue? No, the same capacity works fine the rest of the week, and the volume of traffic is the same on all days. The issue occurs only on Monday morning, and if Monday happens to be a holiday, it occurs on Tuesday instead. So is some backend running batch jobs or crons at that time? How do I find the rogue backend that is most likely tanking the CPU? It could also be some other reason. What other reasons can I think of for an issue with this pattern of occurrence? Before thinking of the likely root cause, please remember again that the issue occurs only on Monday morning! All your inputs are welcome!
Thank you!
-Raj.
------------------------------
Rajkumar
------------------------------