Order Management & Fulfillment

Preparing Your On-Premise IBM Sterling Order Management for Holiday Peak Season 2025

By Shoeb Bihari posted Wed October 29, 2025 02:03 PM

  

Holiday Readiness Guide: On-Premise IBM Sterling Order Management

A comprehensive guide to ensure your on-premise OMS environment is optimized for peak holiday season performance.

Understanding the Challenge

The holiday season brings unprecedented traffic to your order management system. Without proper preparation, this can lead to:

  • Performance degradation
  • System instability
  • Order processing delays
  • Poor customer experience

By following the recommendations in this guide, you can minimize these risks and ensure your on-premise OMS environment performs optimally during peak season.

IBM Sterling Order Management - Performance Guide

The Performance Guide, along with the ongoing webinar series, provides insight into our proven best practices around customization patterns, configurations, testing, and ongoing housekeeping. As with all recommendations, test these in a non-production environment first, then validate and tune them for your specific environments and use cases.

Essential Diagnostics to Have Ready

Platform Diagnostics & Critical Metrics

Database (DB2)

  • Lock waits (MON_LOCKWAITS)
  • Current SQL (MON_CURRENT_SQL)
  • Connections (MON_GET_CONNECTION)
    • With yfs.app.identifyconnection=Y set, OMS identifies its DB connections
  • Workload (MON_GET_WORKLOAD)
  • Transaction logs (MON_GET_TRANSACTION_LOG)
  • Package cache delta (MON_GET_PKG_CACHE_STMT)
DB2 Performance and Lock Analysis using Instana Observability
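The "package cache delta" item above is worth a short illustration, because package cache counters are cumulative: a single snapshot shows lifetime totals, while the difference between two snapshots shows what was expensive during the interval. The following is a minimal sketch in Python, assuming you have already exported two snapshots into rows. The StmtSample class and its field names are hypothetical and chosen only for this example, and the statement texts are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StmtSample:
    """One row from a package-cache snapshot (hypothetical subset of columns)."""
    executable_id: str      # identifier of the cached statement
    num_executions: int     # cumulative execution count
    total_act_time_ms: int  # cumulative activity time in milliseconds
    stmt_text: str


def package_cache_delta(before, after, top_n=5):
    """Rank statements by the activity time they added between two snapshots."""
    baseline = {s.executable_id: s for s in before}
    deltas = []
    for cur in after:
        prev = baseline.get(cur.executable_id)
        # A statement absent from the first snapshot entered the cache in
        # between, so its whole counter counts as delta.
        d_exec = cur.num_executions - (prev.num_executions if prev else 0)
        d_time = cur.total_act_time_ms - (prev.total_act_time_ms if prev else 0)
        if d_exec <= 0:
            continue  # not executed in the interval, or counters were reset
        deltas.append({
            "stmt_text": cur.stmt_text,
            "executions": d_exec,
            "act_time_ms": d_time,
            "avg_ms": d_time / d_exec,
        })
    deltas.sort(key=lambda r: r["act_time_ms"], reverse=True)
    return deltas[:top_n]


# Placeholder data standing in for two exported snapshots.
before = [
    StmtSample("A", 100, 5_000, "SELECT ... (statement A)"),
    StmtSample("B", 10, 200, "SELECT ... (statement B)"),
]
after = [
    StmtSample("A", 150, 30_000, "SELECT ... (statement A)"),
    StmtSample("B", 10, 200, "SELECT ... (statement B)"),
    StmtSample("C", 4, 1_000, "UPDATE ... (statement C)"),
]
top = package_cache_delta(before, after)
for row in top:
    print(row["stmt_text"], row["executions"], row["avg_ms"])
```

Sorting by added activity time rather than by average latency surfaces statements that are cheap per call but run so often that they dominate the interval.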

JVM Profiling Tools

  • IBM Health Center for Java
  • Oracle Java Flight Recorder (JFR)
  • Thread dump scripts (kill -3 PID or jstack -l PID)
  • Verify container CPU/Memory limit and usage
MustGather: Performance, hang, or high CPU issues with WebSphere Application Server on Linux

Application Diagnostics

  • TIMER & SQLDEBUG Trace capabilities
  • YFS_STATISTICS_DETAIL table monitoring
  • UI component diagnostics (console & network logs)
MustGather for IBM Sterling Order Management: Performance Issues

Performance Optimization Strategies

Database Optimization

1. Review and Apply Indices

  • Ensure proper indexing on frequently queried tables
  • Pay special attention to YFS_PERSON_INFO, YFS_CUSTOMER, etc. for customer searches

2. Query Optimization

  • Avoid using UPPER function in queries
  • Implement shadow columns for case-insensitive searches
  • Clean up stale data from configuration tables
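To show why a shadow column helps, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for the production database. The table person_info, the column last_name_lc, and the index name are hypothetical and are not the real OMS schema. The idea is that wrapping the column in UPPER() prevents an ordinary index on that column from being used, whereas a pre-normalized shadow column can be indexed and compared directly.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person_info (
        person_key   INTEGER PRIMARY KEY,
        last_name    TEXT NOT NULL,
        last_name_lc TEXT NOT NULL   -- shadow column: lowercased copy of last_name
    );
    CREATE INDEX idx_person_last_name_lc ON person_info (last_name_lc);
""")


def insert_person(key, last_name):
    # The writer keeps the shadow column in sync with the original value.
    conn.execute(
        "INSERT INTO person_info VALUES (?, ?, ?)",
        (key, last_name, last_name.lower()),
    )


for key, name in enumerate(["Smith", "SMITH", "Nguyen", "smith", "Garcia"], start=1):
    insert_person(key, name)


def plan(sql, param):
    """Return the query-plan detail text SQLite reports for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, (param,)).fetchall()
    return " | ".join(row[3] for row in rows)


# Function applied to the column: the index on the shadow column cannot help.
upper_sql = "SELECT person_key, last_name FROM person_info WHERE UPPER(last_name) = ?"
# Shadow column: normalize the search term instead and compare to the indexed column.
shadow_sql = "SELECT person_key, last_name FROM person_info WHERE last_name_lc = ?"

upper_plan = plan(upper_sql, "SMITH")
shadow_plan = plan(shadow_sql, "smith")
matches = conn.execute(shadow_sql, ("Smith".lower(),)).fetchall()

print(upper_plan)
print(shadow_plan)
print(matches)
```

The exact plan text differs between database engines, so treat the output as an illustration of the principle and confirm the behavior with your own database's explain tooling.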

3. Connection Management

  • Monitor connection pools and adjust as needed
  • Identify and terminate long-running transactions

Application Tuning

1. API Usage

  • Avoid open-ended list API calls
  • Use adequate filters in all API requests
  • Optimize output templates to retrieve only necessary data
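The three points above can be sketched as a pair of small helpers: one that refuses to build a list request without filters and always attaches a row cap, and one that builds an output template naming only the attributes the caller needs. This is a hedged illustration in Python. The helper functions are hypothetical, and the attribute names used in the example call (including MaximumRecords as the row cap) are assumptions that should be checked against the API documentation for your version.

```python
import xml.etree.ElementTree as ET


def build_list_input(root_name, filters, max_records):
    """Build a list-API input document that is never open-ended."""
    if not filters:
        raise ValueError("refusing to build a list request with no filters")
    if max_records <= 0:
        raise ValueError("max_records must be a positive cap")
    attrs = dict(filters)
    attrs["MaximumRecords"] = str(max_records)  # assumed name of the row-cap attribute
    return ET.tostring(ET.Element(root_name, attrs), encoding="unicode")


def build_output_template(list_name, row_name, wanted_attrs):
    """Build an output template naming only the attributes the caller needs."""
    root = ET.Element(list_name)
    ET.SubElement(root, row_name, {name: "" for name in wanted_attrs})
    return ET.tostring(root, encoding="unicode")


# Illustrative values only; verify attribute names before using them.
request_xml = build_list_input(
    "Order",
    {"EnterpriseCode": "ACME", "BillToID": "CUST-1001"},
    max_records=50,
)
template_xml = build_output_template("OrderList", "Order", ["OrderNo", "OrderDate", "Status"])

print(request_xml)
print(template_xml)
```

Centralizing request construction in helpers like these makes it harder for an unfiltered or uncapped list call to slip into custom code unnoticed.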

2. Tracing and Logging

  • Enable traces only at minimum required level
  • Do not apply any tracing during peak hours as it can degrade performance severely
  • Use traces for very short periods and disable immediately
  • Implement TraceTTL to automatically disable verbose logging

3. Admin Utilities

  • Restrict access to admin utilities
  • Avoid using DB Query Tool for large result sets

Critical Workload Adjustments

1. Agent Management

  • Halt complex order reallocation agents (like IBA)
  • Disable non-essential purges
  • Keep essential purge agents running at all times (e.g., Inventory Purge)

2. Capacity Management

  • Do not set infinite thresholds as a way of disabling capacity checks
  • Configure appropriate capacity thresholds before peak season

3. Manual Activities

  • Pause manual activity through API tester or DB query client tools
  • Suspend manual reporting query execution
  • Use Data Extract agent or scheduled reports instead

Backlog Management

1. Identify Contention Points

  • Monitor database queries
  • Check JVM resources and GC overhead
  • Review container CPU utilization

2. Throttle Workloads

  • Reduce threads and JVM instances to avoid contention
  • Understand backlog by querying YFS_TASK_Q table
  • Monitor queue depth regularly
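One way to understand the backlog is to count due tasks per transaction and note the oldest one in each group. The sketch below runs that query against an in-memory sqlite3 table that stands in for YFS_TASK_Q. The column names transaction_key and available_date reflect my understanding of the table and should be verified against your schema, and the transaction labels and timestamps are made up for the example.

```python
import sqlite3

# Count tasks that are already due, grouped by transaction, oldest first per group.
BACKLOG_SQL = """
    SELECT transaction_key, COUNT(*) AS pending, MIN(available_date) AS oldest
    FROM yfs_task_q
    WHERE available_date <= ?
    GROUP BY transaction_key
    ORDER BY pending DESC
"""


def task_backlog(conn, now):
    """Return (transaction_key, pending_count, oldest_available_date) rows."""
    return conn.execute(BACKLOG_SQL, (now,)).fetchall()


# Stand-in table with only the columns this sketch needs.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE yfs_task_q (task_q_key TEXT, transaction_key TEXT, available_date TEXT)"
)
conn.executemany("INSERT INTO yfs_task_q VALUES (?, ?, ?)", [
    ("1", "SCHEDULE", "2025-11-28 09:00:00"),
    ("2", "SCHEDULE", "2025-11-28 09:05:00"),
    ("3", "SCHEDULE", "2025-11-28 09:10:00"),
    ("4", "RELEASE", "2025-11-28 09:02:00"),
    ("5", "RELEASE", "2025-11-28 12:00:00"),  # not yet due, so excluded
])

backlog = task_backlog(conn, "2025-11-28 10:00:00")
for transaction_key, pending, oldest in backlog:
    print(transaction_key, pending, oldest)
```

Sampling this at a fixed interval and comparing counts shows whether a backlog is draining or growing, which is more useful for throttling decisions than a single reading. During peak, run such queries sparingly and preferably against a read replica or with an uncommitted-read isolation level, in line with the guidance on manual queries.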

Sterling Intelligent Promising (SIP) Considerations

1. Availability Operations

  • Run availability snapshots only during off-peak hours
  • Use reduced snapshots (delivery-method based)
  • Apply minimum window of 30 days for zero-availability cleanup

2. Distribution Group Management

  • Schedule DG updates/sync during off-peak hours
  • Use Node + Item level DG sync for targeted updates

3. Network Configurations

  • Avoid recomputing network availability during peak periods
  • Perform Node on/off activities only during low-traffic windows

Pre-Peak Preparation Checklist

  • Run Close Order agent to make orders eligible for purge
  • Aggressively purge tables to keep them lightweight
  • Review and disable non-critical Order Monitor rules
  • Perform end-to-end performance testing with peak workloads
  • Update to latest fixpack or patch level
  • Document escalation procedures and support contacts
  • Prepare runbooks with precise actions for common issues

Mitigation Tips

🔴 Server Unresponsive / JVM Crash

MustGather Documentation

Symptoms

  • High queue depth alert
  • Real-time calls return 500 or 502 errors
  • Server down alerts
  • High thread utilization (WebContainer, DefaultExecutor)
  • High GC/Heap utilization
  • OOM, Stack Overflow exceptions

Diagnostics to Collect

  • 3x javacore/threaddump, 20 sec apart
  • heapdump and GC logs for OOM
  • linperf.sh for high CPU
  • Server logs
  • YFS_STATISTICS_DETAIL export
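Collecting three dumps at a fixed spacing is easy to get wrong under pressure, so it can help to script it in advance. The following is a minimal Python sketch, assuming a Linux host where sending SIGQUIT (the signal behind kill -3) makes the JVM write a thread dump or javacore. The function name and its parameters are hypothetical, and the send and sleep arguments exist only so the logic can be exercised without a real JVM.

```python
import os
import signal
import time


def request_thread_dumps(pid, count=3, interval_sec=20, send=os.kill, sleep=time.sleep):
    """Send SIGQUIT to a JVM process `count` times, `interval_sec` apart."""
    sent = 0
    for i in range(count):
        send(pid, signal.SIGQUIT)
        sent += 1
        if i < count - 1:
            sleep(interval_sec)  # no wait is needed after the final dump
    return sent


# Dry run with stand-ins so nothing is actually signalled.
calls, sleeps = [], []
sent = request_thread_dumps(
    1234,
    send=lambda pid, sig: calls.append((pid, sig)),
    sleep=sleeps.append,
)
print(sent, sleeps)

# Real usage (replace 1234 with the JVM's process ID):
# request_thread_dumps(1234)
```

Where the dump lands depends on the JVM: it may be written to the server's standard output log or to a separate javacore file, so confirm the location in a non-production environment before you need it.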

Immediate Actions

  • ✓ Restart the servers to mitigate the issue
  • ✓ Increase heap (Xmx) if server is going OOM

🔵 Database Slowness

Latches/Contention/Locks | Slow Queries

MustGather Documentation

Symptoms

  • High DB transaction log utilization
  • Excessive latches/contention
  • Excessive YFC0006, YFC0003 errors
  • Excessive DB Connections, high wait time
  • High DB resources (CPU, Memory, IO) utilization

Diagnostics to Collect

  • oms-db2collect.sh
  • db2support (DB2) or AWR report (Oracle)
  • 3x javacore/threaddump, 20 sec apart from application JVM
  • YFS_STATISTICS_DETAIL export

Immediate Actions

  • ✓ For locking, terminate the connection holding the lock from DB side
  • ✓ For a slow query, capture the EXPLAIN plan and index advice, then apply the recommended indices

🟢 Application Slowness

API, Agent, Integration

MustGather Documentation

Symptoms

  • High queue depth alert
  • Real-time calls return 500 or 502 errors
  • Inventory/Order lookup calls failing
  • High transaction backlog for schedule, release, etc.

Diagnostics to Collect

  • 3x javacore/threaddump, 20 sec apart
  • db2support (DB2) or AWR report (Oracle)
  • TIMER or SQLDEBUG trace for 5 minutes or single transaction
  • Application logs
  • YFS_STATISTICS_DETAIL export

Immediate Actions

  • ✓ Throttle workload
  • ✓ Stop unnecessary workload/servers to reduce load on DB, JMS or External System
  • ✓ Scale up only if doing so does not degrade response time

🟡 UI Slowness

Call Centre, Web Store, OrderHub, etc.

MustGather Documentation

Symptoms

  • CSR unable to look up orders
  • Store Associates unable to perform shipment action
  • Browser hanging, or generic error message

Diagnostics to Collect

  • Screen recording/capture
  • HAR file (browser debug)
  • Application and LB access logs

Immediate Actions

  • ✓ Delete browser cache, cookies, and browser temporary files
  • ✓ Verify network connectivity

🟣 OMS Container Issues

MustGather Documentation

Symptoms

  • Container getting restarted frequently
  • Health check failures

Diagnostics to Collect

  • Deployment details (YAML) | describe pod
  • CPU/Memory request/limit
  • Container CPU/Memory utilization metrics
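When comparing utilization metrics with the configured limits, the values are usually expressed as Kubernetes-style quantities such as 500m or 2Gi, which makes a quick comparison awkward. The sketch below, in Python, parses the common suffixes and reports how close a container is to its limit. The function names and the 85% warning threshold are assumptions for illustration, not recommended values.

```python
# Multipliers for the common Kubernetes quantity suffixes.
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def parse_quantity(text):
    """Convert a quantity such as '500m', '2', or '4Gi' to a plain number."""
    text = text.strip()
    if text.endswith("m"):  # milli-units, e.g. 500m CPU = 0.5 cores
        return float(text[:-1]) / 1000
    for suffix in ("Ki", "Mi", "Gi", "Ti", "k", "M", "G", "T"):
        if text.endswith(suffix):
            return float(text[: -len(suffix)]) * _UNITS[suffix]
    return float(text)


def utilization(usage, limit):
    """Return usage as a fraction of the limit."""
    return parse_quantity(usage) / parse_quantity(limit)


def near_limit(usage, limit, threshold=0.85):
    """Flag a container whose usage is at or above the assumed threshold."""
    return utilization(usage, limit) >= threshold


cpu = utilization("1800m", "2")
mem = utilization("3Gi", "4Gi")
print(f"cpu {cpu:.0%} near limit: {near_limit('1800m', '2')}")
print(f"mem {mem:.0%} near limit: {near_limit('3Gi', '4Gi')}")
```

A container that sits near its memory limit is a likely candidate for the frequent restarts listed under Symptoms, so it is worth checking this before changing requests or limits.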

Immediate Actions

  • ✓ Restart the servers to mitigate the issue
  • ✓ Increase CPU/Memory requests

Engaging with Support

1. Team Preparation

  • Confirm support contacts for all system components
  • Define a 24x7 support schedule for peak hours
  • Establish clear communication channels and escalation procedures

2. Immediate Response Plan

  • Be ready to restart JVMs if necessary
  • Prepare throttling strategies for overloaded servers
  • Know how to reduce JVMs or threads quickly
  • Have procedures to capture necessary logs and diagnostics

Conclusion

Proper preparation is the key to a successful holiday season for your on-premise IBM Sterling Order Management system. By implementing these recommendations, you'll be well-positioned to handle the increased demand and provide a seamless experience for your customers during this critical business period.

Remember: Proactive monitoring and quick response to emerging issues are essential. Use the diagnostic tools and monitoring capabilities built into the system to identify and address potential problems before they impact your business operations.

For more detailed technical best practices, refer to the IBM Sterling OMS Performance Guide and Technical Best Practices documentation.

Sterling OMS Support 101 IBM Sterling Order Management - Performance Guide