Tips for working with deadlocks in Db2 for z/OS (v13)
Tip 1 Understand what a deadlock is
A deadlock occurs when two or more application processes each hold locks on resources the others need, and neither can proceed. After a preset time interval (the value of the DEADLOCK TIME) Db2 will roll back one of the processes so the others can continue.
Why it matters: If you don't recognise that a given SQLCODE -911 or -913 is due to a deadlock you might target the wrong fix.
Tip 2 Detecting deadlocks
• When a deadlock is detected, Db2 writes a trace record (IFCID 0172) if statistics trace class 3 is active.
• The SQLERRD (3) field of the SQLCA contains reason code 00C90088 for a deadlock.
• The SQLCODE returned will typically be -911 (if rollback succeeded) or -913 (if rollback failed) in many environments.
Tip: Make sure your monitoring/tracing is set up so you can catch IFCID 0172 or these reason codes, that helps you confirm you had a deadlock rather than just long wait/timeout.
Tip 3 Distinguish deadlock vs timeout vs suspension
• Suspension: a process waits because it requests a lock held by another process. It resumes when the lock is released or when timeout/deadlock occurs.
• Timeout: a process waits too long (exceeds a preset interval) and is terminated by Db2. You'll see SQLCODE -911 or -913 with reason code 00C9008E.
• Deadlock: mutual blocking situation, resolved by Db2 choosing a process to roll back.
Tip: When diagnosing lock issues, identify whether it's suspension (simple wait), timeout (wait > limit), or deadlock (cycle). The remedial actions differ.
Tip 4 Tune the DEADLOCK TIME parameter
The DEADLOCK TIME value controls how long Db2 waits before deciding something is a deadlock and taking action.
Tip: If you find many deadlocks, you might experiment with reducing the DEADLOCK TIME (so Db2 resolves quicker) but do so cautiously, as reducing too much might lead to unnecessary rollbacks. Conversely, if you rarely see true deadlocks but many timeouts, increasing the time might help for legitimate longer-running transactions.
Tip 5 Analyse your transaction and locking patterns
• Look for transaction sequences where process A locks resource X then tries to get Y, while process B holds Y then tries to get X (classic deadlock).
• Avoid scenarios where many transactions lock in differing orders on shared resources.
Tip: Enforce a consistent locking order (e.g., always access tables in the same sequence) to reduce the risk of deadlock cycles.
Tip 6 Shorten the duration of locks
The longer a lock is held, the higher the risk of deadlock/suspension.
Tip:
• Commit frequently (to release locks).
• Keep transactions short and do only the necessary work under lock.
• Avoid long-running operations within the same unit of work if they hold exclusive locks. By reducing the time locks are held you reduce the window in which other transactions might intervene and raise contention.
Tip 7 Use appropriate isolation levels and lock semantics
Locking behaviour depends on isolation levels, lock scopes (table, page, row) and whether you're using SHARE, EXCLUSIVE, etc.
Tip: If feasible, use less-restrictive lock types (e.g., row locks instead of table locks) or shorter lock durations. Also review if certain operations can use NOWAIT or SKIP LOCKED semantics (if supported) to avoid blocking altogether.
Tip 8 Monitor & gather statistics/traces
• Ensure that the relevant IFCIDs (e.g., 0172 for deadlock) are enabled.
• Monitor for spikes in SQLCODE -911/-913 with deadlock reason codes.
• Collect lock contention statistics to see which objects/transactions are frequent culprits.
Tip: Regular review of these statistics lets you proactively identify and rectify "heavy hitters" in your environment before they cause user impact.
Tip 9 Review application design and concurrency model
Deadlocks often point to higher-level design issues rather than just DB configuration.
Tip:
• Have your developers review access patterns to shared tables.
• Where possible, partition work so heavy updates don't overlap on the same resources.
• Use optimistic concurrency (retry logic) where appropriate rather than pessimistic locks.
• For batch jobs, coordinate timing so high-contention jobs don't run concurrently.
Tip 10 Have an application retry strategy
Even with the best design you cannot eliminate all deadlocks. One of the processes will be rolled back by Db2 to break the cycle.
Tip: Build into your application logic a mechanism to catch SQLCODE -911/-913 (or equivalent) and retry the transaction after a random back-off. This reduces user impact and allows the system to recover gracefully.
------------------------------
Sandeep Jain
------------------------------