WebSphere Application Server & Liberty


Hung thread detection in WebSphere Application Server

By joseph raj posted Fri April 26, 2013 05:10 PM

  

The hung thread detection policy in WebSphere Application Server reports potentially hung threads by writing notifications to the log files. You can configure the hang detection policy to suit your applications and environment so that potential hangs are reported early, which helps you detect failing servers sooner. With a hang detection policy you specify how long a unit of work may run before that time is considered too long. A thread monitor periodically checks all managed threads in the server and reports any thread that has been active longer than that time.

Hung thread detection is enabled by default. You can adjust the values of the detection policy, or disable it, by using the following procedure.

    • Navigate to Servers --> Application Servers --> server_name --> Administration --> Custom Properties
    • Add the following 4 custom properties

      #   Property                                                Description                                                                                        Default
      A   com.ibm.websphere.threadmonitor.interval                How often, in seconds, the thread monitor checks all managed threads for hung threads.             180
                                                                  A value of 0 or less disables hang detection.
      B   com.ibm.websphere.threadmonitor.threshold               How long, in seconds, a thread can be active before it is considered hung                          600
      C   com.ibm.websphere.threadmonitor.false.alarm.threshold   The number of false alarms (T) that can occur before the threshold is automatically increased     100
      D   com.ibm.websphere.threadmonitor.dump.java               When set to 'true', creates a javacore when a hung thread is detected                              false

    • Click OK and save the changes.
    • Synchronize the changes and restart the servers for the changes to take effect.
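To get a feel for how the interval and threshold values interact, here is a minimal Python sketch of a simplified model. It is not the server's implementation: it assumes the monitor checks at fixed multiples of the interval and flags a thread once it has been active strictly longer than the threshold. The function name and the start_offset parameter are made up for this illustration.

```python
def seconds_until_reported(start_offset, interval=180, threshold=600):
    """Simplified model (an assumption, not the product's code).

    start_offset: seconds after the monitor started at which the thread began work.
    Returns how long the thread has been active when a check first flags it.
    """
    check = interval  # monitor checks at interval, 2*interval, ...
    while check - start_offset <= threshold:
        check += interval
    return check - start_offset


# With the defaults, the report arrives somewhere between just over
# 600 seconds and 780 seconds (threshold + interval) of activity.
for offset in (0, 119, 120):
    print(offset, seconds_until_reported(offset))
```

Under this model, lowering the interval tightens the gap between a thread crossing the threshold and the report being written, at the cost of more frequent checks.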

False Alarms:

What happens if a thread eventually finishes its work after it has been reported as hung? This situation is called a 'false alarm'. A large number of these events indicates that the threshold value is too small. The hang detection facility can respond to this situation automatically: every time (T) false alarms have occurred, the threshold is increased by a factor of 1.5.
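The adjustment is easy to work through as arithmetic. The following Python sketch assumes the increases compound, so each batch of (T) false alarms multiplies the current threshold by 1.5. The function and parameter names are hypothetical and chosen only for this example.

```python
def adjusted_threshold(false_alarms, threshold=600, alarms_per_increase=100, factor=1.5):
    """Threshold in seconds after a given number of false alarms.

    Assumes one increase per full batch of alarms_per_increase false alarms (T),
    with the increases compounding.
    """
    increases = false_alarms // alarms_per_increase
    return threshold * factor ** increases


for alarms in (99, 100, 200):
    print(alarms, adjusted_threshold(alarms))
```

With the default values, the threshold stays at 600 seconds until the 100th false alarm, then moves to 900 seconds, and to 1350 seconds after 200 false alarms.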

Notifications for J2EE applications:

The notifications from the hang thread detector appear in the log file.

Hung thread report (WSVR0605W):
Thread ‘thread_name’ has been active for ‘xxx sec’ and may be hung. There are ‘N’ threads in total in the server that may be hung.

False alarm (WSVR0606W):
Thread ‘thread_name’ was previously reported to be hung but has completed. It was active for approximately ‘yyy sec’. There are ‘N’ threads in total in the server that still may be hung.

Auto adjustment (WSVR0607W):
Too many thread hangs have been falsely reported. The hang threshold is now being set to ‘zzz sec’.
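If you want a quick tally of these notifications, a small script can count the three message IDs in a log. The Python sketch below matches only on the IDs listed above and makes no assumption about the rest of the log line format. The sample lines are placeholders, not real server output, and the summarize function is a hypothetical helper.

```python
import re
from collections import Counter

# Message IDs taken from the list above, mapped to a short label.
MESSAGE_KINDS = {
    "WSVR0605W": "hung thread report",
    "WSVR0606W": "false alarm",
    "WSVR0607W": "auto adjustment",
}
ID_PATTERN = re.compile(r"\b(WSVR060[567]W)\b")


def summarize(lines):
    """Count hang detection notifications by kind in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        match = ID_PATTERN.search(line)
        if match:
            counts[MESSAGE_KINDS[match.group(1)]] += 1
    return dict(counts)


# Placeholder lines for illustration only.
sample = [
    "WSVR0605W: (hung thread report text)",
    "unrelated log line",
    "WSVR0605W: (hung thread report text)",
    "WSVR0606W: (false alarm text)",
    "WSVR0607W: (auto adjustment text)",
]
print(summarize(sample))
```

To run it against a real log, pass an open file object instead of the sample list. Many false alarms relative to hung thread reports suggest the threshold is too low for the workload.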
