In the last post, I gave a gentle introduction to scalability issues, with the aim of covering specific issues in later posts. In this post, I will walk through a real-world example of Java lock contention, which can be one cause of scalability problems.
Recently, I was working on scalability issues of a Java Standard Edition application. The application is fairly complex. Here are some of the characteristics of this application:
- The application can be launched multiple times concurrently. So, multiple JVMs of this application can be processing work at the same time.
- In each JVM, the application creates multiple thread pools; each pool handles a specific type of transaction processing.
- Once the application is launched, it stays up until it is manually terminated. In a way, the application acts like an application server, although not Java EE compliant.
The application is running into a number of scalability issues. In this post, I will talk about one issue related to Java lock contention. Locking is not specific to Java. If a group of threads needs to change a shared resource, access to that resource typically has to be synchronized. If it is not, the integrity of the resource is at risk.
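To make the idea concrete, here is a minimal sketch of synchronized access to a shared resource. It is not code from the application discussed in this post; the class SharedCounter and the method runWorkers are hypothetical names chosen for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical shared resource: a counter that several threads update.
class SharedCounter {
    private final Object lock = new Object(); // monitor guarding 'count'
    private int count = 0;

    void increment() {
        synchronized (lock) { // only one thread at a time may hold the lock
            count++;          // read-modify-write, not atomic on its own
        }
    }

    int get() {
        synchronized (lock) {
            return count;
        }
    }

    // Starts 'threads' workers, each incrementing 'perThread' times,
    // waits for all of them, and returns the final count.
    static int runWorkers(int threads, int perThread) {
        SharedCounter counter = new SharedCounter();
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < threads; i++) {
            Thread t = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.increment();
                }
            });
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting for workers", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(runWorkers(4, 10_000));
    }
}
```

With the synchronized blocks in place, the final count always equals threads multiplied by perThread. Without them, concurrent increments can overwrite each other and the total can come up short, which is the integrity problem described above. The price is that threads now queue for the lock, and that queueing is where contention comes from.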
For a Java application, Java thread dumps or Java core files can show whether lock contention is happening. For the scalability issue I was working on, we collected a series of thread dumps during a test in which the application was not scaling. How often thread dumps should be collected depends on the problem you are trying to solve. To check whether there is a Java lock contention issue, a single thread dump can be adequate. However, we collected several to see whether the contention was a brief occurrence or was happening throughout the test. To collect a Java core file or a Java thread dump, issue kill -3 <PID>, where <PID> is the Java process ID. You can find the process ID with the ps command, which offers many options (for example, ps -ef). Both kill and ps are available on many flavors of Linux and Unix.
For IBM Java, there is a lot of useful information printed in the Java thread dump, but I will only talk about the Java locking information, just enough to tell us whether we have a lock contention issue that requires further attention. After the thread dump is collected, open it with a text editor. The following is a clear indication that some lock contention is happening:
3XMTHREADINFO "thread-1" J9VMThread:0x0000000030ABCD00, omrthread_t:0x000001002732A228, java/lang/Thread:0x000000008025FCA8, state:B, prio=5
3XMCPUTIME CPU usage total: 0.904382000 secs, user: 0.627142000 secs, system: 0.277240000 secs, current category="Application"
3XMTHREADBLOCK Blocked on: lock Owned by: "thread-2" (J9VMThread:0x0000000030B76800, java/lang/Thread:0x000000008027DBF8)
The lock is a reference to a Java object. The thread thread-2 currently owns the lock, and thread-1 is blocked waiting to acquire it (state:B on the 3XMTHREADINFO line marks the thread as blocked). Note that thread-1, thread-2 and lock are only placeholders in this example.
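The same "blocked on / owned by" relationship can also be observed from inside a JVM. The sketch below is not how the dump above was produced; it uses the standard java.lang.management API (ThreadMXBean and ThreadInfo) and sets up two threads, named thread-1 and thread-2 to mirror the placeholders, purely for illustration. The class BlockedThreadReport and its methods are hypothetical names.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;

class BlockedThreadReport {
    // Returns one line per BLOCKED thread: who is blocked, on which lock, and who owns it.
    static List<String> blockedThreads() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        List<String> lines = new ArrayList<>();
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                lines.add("\"" + info.getThreadName() + "\" blocked on " + info.getLockName()
                        + " owned by \"" + info.getLockOwnerName() + "\"");
            }
        }
        return lines;
    }

    // Illustration only: "thread-2" holds the lock while "thread-1" waits for it.
    static List<String> demo() {
        final Object lock = new Object();
        CountDownLatch held = new CountDownLatch(1);    // signals that the owner has the lock
        CountDownLatch release = new CountDownLatch(1); // tells the owner to let go
        Thread owner = new Thread(() -> {
            synchronized (lock) {
                held.countDown();
                try {
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }
        }, "thread-2");
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // nothing to do; this thread only needs to contend for the lock
            }
        }, "thread-1");
        try {
            owner.start();
            held.await();
            waiter.start();
            while (waiter.getState() != Thread.State.BLOCKED) {
                Thread.sleep(10); // wait until the waiter is actually blocked
            }
            return blockedThreads();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted during demo", e);
        } finally {
            release.countDown(); // let both threads finish
        }
    }

    public static void main(String[] args) {
        for (String line : demo()) {
            System.out.println(line);
        }
    }
}
```

Each reported line carries the same three pieces of information as the 3XMTHREADBLOCK entry: the blocked thread, the lock object, and the owning thread. The exact text of the lock name depends on the JVM.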
Using the thread dump, we spotted that Java lock contention is happening. But how do we know whether it is causing a scalability issue? How do we know whether the lock is owned and released quickly? If it is, the scalability issues are likely due to other causes. If the following applies to the application, then the lock contention is likely a cause of the application scalability issues:
- A lot of work is done while a thread owns the lock.
- There are many threads waiting for that same lock. The more threads you have waiting, the longer they will wait if they all need to do the same amount of work.
- If each thread prints how long it owns the lock, longer times mean worse lock contention, especially when many threads are waiting.
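One way to have each thread print how long it owns the lock is sketched below. This is only an illustration under assumptions: the class LockHoldTimer, the method runLocked and the 50 ms of simulated work are all hypothetical and not taken from the application. The sketch also records how long each thread waited before acquiring the lock, since the wait grows with the number of threads queued ahead of it.

```java
class LockHoldTimer {
    private final Object lock = new Object();

    // Runs 'work' while holding the lock, prints wait and hold times,
    // and returns how long the lock was held, in nanoseconds.
    long runLocked(Runnable work) {
        long requested = System.nanoTime();
        long acquired;
        long heldNanos;
        synchronized (lock) {
            acquired = System.nanoTime();
            work.run();
            heldNanos = System.nanoTime() - acquired;
        }
        // Print after releasing the lock so logging does not lengthen the hold time.
        System.out.printf("%s waited %d ms, held lock %d ms%n",
                Thread.currentThread().getName(),
                (acquired - requested) / 1_000_000,
                heldNanos / 1_000_000);
        return heldNanos;
    }

    // Stand-in for real work done under the lock (assumed to take about 50 ms).
    static void simulateWork() {
        try {
            Thread.sleep(50);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        LockHoldTimer timer = new LockHoldTimer();
        Thread[] workers = new Thread[3];
        for (int i = 0; i < workers.length; i++) {
            workers[i] = new Thread(() -> timer.runLocked(LockHoldTimer::simulateWork), "worker-" + i);
            workers[i].start();
        }
        for (Thread t : workers) {
            t.join();
        }
    }
}
```

If the hold times are small and the wait times stay close to zero, the lock is probably not the bottleneck. Long hold times combined with wait times that grow as more threads are added point to the lock as a likely cause of the scalability problem.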
In the next post, I will continue the lock contention topic.