Preparing Your On-Premise IBM Sterling Order Management for Holiday Peak Season 2025

By Shoeb Bihari posted 2 days ago

As the holiday shopping season approaches, retailers with on-premise IBM Sterling Order Management deployments need to ensure their systems are optimized to handle the increased demand. This critical business period requires careful planning and preparation to maintain system stability and performance. Here's your comprehensive guide to holiday readiness for on-premise OMS environments.

Understanding the Challenge

The holiday season brings unprecedented traffic to your order management system. Without proper preparation, this can lead to:

- Performance degradation

- System instability

- Order processing delays

- Poor customer experience

By following the recommendations in this guide, you can minimize these risks and ensure your on-premise OMS environment performs optimally during peak season.

IBM Sterling Order Management - Performance Guide

This document, along with the ongoing webinar series, provides insight into our proven best practices around customization patterns, configurations, testing, and ongoing housekeeping. As with all recommendations, be sure to test these out in a non-production environment to validate and tune to your specific environments and use cases.

Essential Diagnostics to Have Ready: Platform Diagnostics & Critical Metrics

Database (DB2):

- Lock waits (MON_LOCKWAITS)

- Current SQL (MON_CURRENT_SQL)

- Connections (MON_GET_CONNECTION)

- To make OMS connections identifiable in the connection output, set yfs.app.identifyconnection=Y

- Workload (MON_GET_WORKLOAD)

- Transaction logs (MON_GET_TRANSACTION_LOG)

- Package cache delta (MON_GET_PKG_CACHE_STMT)

DB2 Performance and Lock Analysis using Instana Observability
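The "package cache delta" item above can be sketched as a comparison of two snapshots taken a few minutes apart. This is a minimal illustration, not an IBM-provided tool: the query text, the column selection, and the function name package_cache_delta are assumptions, and the rows are expected as plain dictionaries fetched with whatever Db2 driver you already use.

```python
from typing import Dict, List

# Assumed snapshot query; the column list is illustrative and should be
# validated against your Db2 level in a non-production environment.
PKG_CACHE_SQL = """
SELECT EXECUTABLE_ID, NUM_EXECUTIONS, STMT_EXEC_TIME, STMT_TEXT
FROM TABLE(MON_GET_PKG_CACHE_STMT(NULL, NULL, NULL, -2))
"""


def package_cache_delta(before: List[Dict], after: List[Dict], top_n: int = 10) -> List[Dict]:
    """Rank statements by execution time accumulated between two snapshots."""
    baseline = {row["EXECUTABLE_ID"]: row for row in before}
    deltas = []
    for row in after:
        prev = baseline.get(row["EXECUTABLE_ID"], {})
        executions = row["NUM_EXECUTIONS"] - prev.get("NUM_EXECUTIONS", 0)
        exec_time_ms = row["STMT_EXEC_TIME"] - prev.get("STMT_EXEC_TIME", 0)
        if executions <= 0:
            # No new executions in the interval, or the counters went
            # backwards; skip the entry rather than report a bogus delta.
            continue
        deltas.append({
            "executions": executions,
            "exec_time_ms": exec_time_ms,
            "avg_ms": exec_time_ms / executions,
            "stmt_text": row["STMT_TEXT"],
        })
    deltas.sort(key=lambda d: d["exec_time_ms"], reverse=True)
    return deltas[:top_n]
```

Ranking by the delta rather than by the raw counters highlights what was expensive during the interval you care about, instead of what has accumulated since the statement entered the cache.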

JVM Profiling Tools:

- IBM Health Center for Java

- Oracle Java Flight Recorder (JFR)

- Thread dump scripts (kill -3 PID or jstack -l PID)

- Verify container CPU/Memory limit and the usage

MustGather: Performance, hang, or high CPU issues with WebSphere Application Server on Linux
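A thread dump script of the kind listed above might look like the following sketch. It follows the cadence given in the mitigation table later in this post (three dumps, 20 seconds apart) and shells out to jstack -l. The function names, file naming, and output directory are illustrative assumptions. On IBM J9/OpenJ9 JVMs, kill -3 writes javacore files instead, so the dump callable would need to be swapped accordingly.

```python
import subprocess
import time
from pathlib import Path
from typing import Callable, List


def run_jstack(pid: int) -> str:
    """Capture one thread dump with `jstack -l <pid>` (needs a JDK on PATH)."""
    result = subprocess.run(
        ["jstack", "-l", str(pid)], capture_output=True, text=True, check=True
    )
    return result.stdout


def collect_thread_dumps(
    pid: int,
    out_dir: str,
    count: int = 3,
    interval_sec: float = 20.0,
    dump: Callable[[int], str] = run_jstack,
    sleep: Callable[[float], None] = time.sleep,
) -> List[Path]:
    """Write `count` thread dumps for `pid`, pausing `interval_sec` between them."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    written = []
    for i in range(1, count + 1):
        stamp = time.strftime("%Y%m%d-%H%M%S")
        path = out / f"threaddump-{pid}-{i}-{stamp}.txt"
        path.write_text(dump(pid))
        written.append(path)
        if i < count:
            sleep(interval_sec)  # no pause needed after the final dump
    return written
```

Passing the dump and sleep callables as parameters keeps the script easy to rehearse in a non-production environment before peak.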

Application Diagnostics:

- TIMER & SQLDEBUG Trace capabilities

- YFS_STATISTICS_DETAIL table monitoring

- UI component diagnostics (console & network logs)

MustGather for IBM Sterling Order Management: Performance Issues

Performance Optimization Strategies

Database Optimization

1. Review and Apply Indices:

- Ensure proper indexing on frequently queried tables

- Pay special attention to YFS_PERSON_INFO, YFS_CUSTOMER, and similar tables used for customer searches

2. Query Optimization:

- Avoid using UPPER function in queries

- Implement shadow columns for case-insensitive searches

- Clean up stale data from configuration tables

3. Connection Management:

- Monitor connection pools and adjust as needed

- Identify and terminate long-running transactions
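The advice on UPPER and shadow columns can be illustrated with a small, self-contained sketch. It uses an in-memory SQLite database purely as a stand-in for the OMS database, and the table person_info, the column last_name_lc, and the index name are hypothetical. How a shadow column is declared and populated in OMS is outside this sketch.

```python
import sqlite3

# Wrapping the column in a function prevents a plain index from being used.
FUNCTION_SEARCH = "SELECT person_info_key FROM person_info WHERE UPPER(last_name) = ?"
# The shadow column stores a pre-normalized copy that can be indexed directly.
SHADOW_SEARCH = "SELECT person_info_key FROM person_info WHERE last_name_lc = ?"


def build_demo_db() -> sqlite3.Connection:
    """Create a tiny table with a lowercase shadow column and an index on it."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person_info (
            person_info_key TEXT PRIMARY KEY,
            last_name TEXT,
            last_name_lc TEXT
        );
        CREATE INDEX person_info_lc_idx ON person_info (last_name_lc);
        """
    )
    people = [("P1", "Smith"), ("P2", "SMITH"), ("P3", "Jones")]
    conn.executemany(
        "INSERT INTO person_info VALUES (?, ?, ?)",
        [(key, name, name.lower()) for key, name in people],
    )
    return conn


def query_plan(conn: sqlite3.Connection, sql: str, params: tuple) -> str:
    """Return the optimizer's plan text for a statement."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[3] for row in rows)  # fourth column holds the detail


def find_by_last_name(conn: sqlite3.Connection, term: str) -> list:
    """Case-insensitive search that normalizes the input, not the column."""
    rows = conn.execute(
        SHADOW_SEARCH + " ORDER BY person_info_key", (term.lower(),)
    ).fetchall()
    return [row[0] for row in rows]
```

Comparing query_plan for the two statements shows the difference: the shadow-column search can use the index, while the UPPER variant has to scan. On Db2 the equivalent check would be done with EXPLAIN.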

Application Tuning

1. API Usage:

- Avoid open-ended list API calls

- Use adequate filters in all API requests

- Optimize output templates to retrieve only necessary data

2. Tracing and Logging:

- Enable traces only at minimum required level

- Avoid tracing during peak hours, as it can severely degrade performance

- If a trace is unavoidable, run it for a very short period and disable it immediately

- Implement TraceTTL to automatically disable verbose logging

3. Admin Utilities:

- Restrict access to admin utilities

- Avoid using DB Query Tool for large result sets
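The API usage guidance above (filters on every list call, a capped result size, and a lean output template) could be enforced with a small helper like the one below. This is a sketch under assumptions: it targets a getOrderList-style input, the attribute names in SELECTIVE_FILTERS and the default template attributes are illustrative and should be checked against the API documentation for your release, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Attributes treated as selective enough to bound a list call (assumed set).
SELECTIVE_FILTERS = ("OrderNo", "OrderHeaderKey", "BillToID", "CustomerEMailID")


def build_order_list_input(enterprise_code: str, max_records: int = 50, **filters: str) -> str:
    """Build list-API input XML, refusing calls with no selective filter."""
    if not any(filters.get(name) for name in SELECTIVE_FILTERS):
        raise ValueError(
            "refusing open-ended list call; supply one of: " + ", ".join(SELECTIVE_FILTERS)
        )
    order = ET.Element(
        "Order", EnterpriseCode=enterprise_code, MaximumRecords=str(max_records)
    )
    for name, value in sorted(filters.items()):
        order.set(name, value)
    return ET.tostring(order, encoding="unicode")


def build_order_list_template(attributes=("OrderHeaderKey", "OrderNo", "Status")) -> str:
    """Build an output template that requests only the listed attributes."""
    root = ET.Element("OrderList")
    order = ET.SubElement(root, "Order")
    for name in attributes:
        order.set(name, "")  # an empty value marks the attribute as wanted
    return ET.tostring(root, encoding="unicode")
```

Failing fast in the calling code is cheaper than discovering an unbounded query in the database during peak.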

Critical Workload Adjustments

1. Agent Management:

- Halt complex order reallocation agents, such as Item Based Allocation (IBA)

- Disable non-essential purges

- Keep essential purge agents running at all times (e.g., Inventory Purge)

2. Capacity Management:

- Avoid disabling capacity by setting infinite thresholds

- Configure appropriate capacity thresholds before peak season

3. Manual Activities:

- Pause manual activity through API tester or DB query client tools

- Suspend manual reporting query execution

- Use Data Extract agent or scheduled reports instead

Backlog Management

1. Identify Contention Points:

- Monitor database queries

- Check JVM resources and GC overhead

- Review container CPU utilization

2. Throttle Workloads:

- Reduce threads and JVM instances to avoid contention

- Understand backlog by querying YFS_TASK_Q table

- Monitor queue depth regularly
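Querying YFS_TASK_Q for backlog, as suggested above, might look like the sketch below. It runs against an in-memory SQLite table that mimics only a few columns, so it can be tried anywhere. The column subset, the placeholder transaction keys, and the function names are assumptions. On Db2 you would typically run the same query with an uncommitted-read clause (WITH UR) so that monitoring does not wait on locks.

```python
import sqlite3
from typing import Dict, List

# Pending tasks per transaction whose available date has already passed.
BACKLOG_SQL = """
SELECT TRANSACTION_KEY, COUNT(*) AS PENDING
FROM YFS_TASK_Q
WHERE AVAILABLE_DATE <= CURRENT_TIMESTAMP
GROUP BY TRANSACTION_KEY
ORDER BY PENDING DESC, TRANSACTION_KEY
"""


def demo_connection() -> sqlite3.Connection:
    """Stand-in database holding a reduced, hypothetical YFS_TASK_Q layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE YFS_TASK_Q ("
        "TASK_Q_KEY TEXT PRIMARY KEY, TRANSACTION_KEY TEXT, AVAILABLE_DATE TEXT)"
    )
    rows = [
        ("T1", "TXN_SCHEDULE", "2000-01-01 00:00:00"),
        ("T2", "TXN_SCHEDULE", "2000-01-01 00:05:00"),
        ("T3", "TXN_RELEASE", "2000-01-01 00:10:00"),
        ("T4", "TXN_SCHEDULE", "2999-01-01 00:00:00"),  # not yet due
    ]
    conn.executemany("INSERT INTO YFS_TASK_Q VALUES (?, ?, ?)", rows)
    return conn


def task_queue_backlog(conn: sqlite3.Connection) -> Dict[str, int]:
    """Return the number of due tasks per transaction key."""
    return {key: pending for key, pending in conn.execute(BACKLOG_SQL)}


def is_growing(samples: List[int], min_samples: int = 3) -> bool:
    """True when depth rose at every step across the last `min_samples` readings."""
    recent = samples[-min_samples:]
    if len(recent) < min_samples:
        return False
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))
```

Sampling the backlog at a fixed interval and feeding the totals to is_growing gives a simple signal for when throttling or scaling decisions are needed.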

Sterling Intelligent Promising (SIP) Considerations

1. Availability Operations:

- Run availability snapshots only during off-peak hours

- Use reduced snapshots (delivery-method based)

- Apply minimum window of 30 days for zero-availability cleanup

2. Distribution Group Management:

- Schedule DG updates/sync during off-peak hours

- Use Node + Item level DG sync for targeted updates

3. Network Configurations:

- Avoid recomputing network availability during peak periods

- Perform Node on/off activities only during low-traffic windows

Pre-Peak Preparation Checklist

- [ ] Run Close Order agent to make orders eligible for purge

- [ ] Aggressively purge tables to keep them lightweight

- [ ] Review and disable non-critical Order Monitor rules

- [ ] Perform end-to-end performance testing with peak workloads

- [ ] Update to latest fixpack or patch level

- [ ] Document escalation procedures and support contacts

- [ ] Prepare runbooks with precise actions for common issues

Mitigation Tips:

MustGather: symptoms, diagnostics to capture, and mitigation steps for each issue type.

Server Unresponsive / JVM Crash

Symptoms:

- High queue depth alert

- Real-time calls result in 500, 502, etc. errors

- Server down alerts

- High thread utilization (WebContainer, DefaultExecutor, etc.)

- High GC/heap utilization

- OOM, Stack Overflow exceptions

Diagnostics:

- 3x javacore/thread dump, 20 seconds apart

- Heap dump and GC logs for OOM

- linperf.sh for high CPU

- Server logs

- YFS_STATISTICS_DETAIL export

Mitigate:

- Restart the servers to mitigate the issue

- Increase heap (Xmx) if the server is going OOM

Database Slowness (Latches/Contention/Locks, Slow Queries)

Symptoms:

- High DB transaction log utilization

- Excessive latches/contention

- Excessive YFC0006, YFC0003 errors

- Excessive DB connections, high wait time

- High DB resource (CPU, memory, IO) utilization

Diagnostics:

- oms-db2collect.sh

- db2support output (Db2) or AWR report (Oracle)

- 3x javacore/thread dump, 20 seconds apart, from the application JVM

- YFS_STATISTICS_DETAIL export

Mitigate:

- For locking, terminate the connection holding the lock from the DB side

- For slow queries, capture EXPLAIN and ADVISE output and apply indices

Application Slowness (API, Agent, Integration)

Symptoms:

- High queue depth alert

- Real-time calls result in 500, 502, etc. errors

- Inventory/order lookup calls failing

- High transaction backlog for schedule, release, etc.

Diagnostics:

- 3x javacore/thread dump, 20 seconds apart

- db2support output (Db2) or AWR report (Oracle)

- TIMER or SQLDEBUG trace for 5 minutes or for a single transaction

- Application logs

- YFS_STATISTICS_DETAIL export

Mitigate:

- Throttle workload

- Stop unnecessary workload/servers to reduce load on the DB, JMS, or external system

- Scale up if response time doesn't degrade

UI Slowness (Call Centre, Web Store, OrderHub, etc.)

Symptoms:

- CSRs unable to look up orders

- Store associates unable to perform shipment actions

- Browser hanging, or a generic error message

Diagnostics:

- Screen recording/capture

- HAR file (browser debug)

- Application and LB access logs

Mitigate:

- Delete browser cache, cookies, and browser temporary files

- Verify network connectivity

OMS Container Issues

Symptoms:

- Container restarting frequently

- Health check failures

Diagnostics:

- Deployment details (YAML) and describe pod output

- CPU/memory request/limit

- Container CPU/memory utilization metrics

Mitigate:

- Restart the servers to mitigate the issue

- Increase CPU/memory requests

Engaging with Support

1. Team Preparation:

- Confirm support contacts for all system components

- Define a 24x7 support schedule for peak hours

- Establish clear communication channels and escalation procedures

2. Immediate Response Plan:

- Be ready to restart JVMs if necessary

- Prepare throttling strategies for overloaded servers

- Know how to reduce JVMs or threads quickly

- Have procedures to capture necessary logs and diagnostics

Proper preparation is the key to a successful holiday season for your on-premise IBM Sterling Order Management system. By implementing these recommendations, you'll be well-positioned to handle the increased demand and provide a seamless experience for your customers during this critical business period.

Remember that proactive monitoring and quick response to emerging issues are essential. Use the diagnostic tools and monitoring capabilities built into the system to identify and address potential problems before they impact your business operations.

For more detailed technical best practices, refer to the IBM Sterling OMS Performance Guide and Technical Best Practices documentation.

Sterling OMS Support 101
IBM Sterling Order Management - Performance Guide


Permalink

https://community.ibm.com/community/user/blogs/shoeb-bihari/2025/10/29/preparing-your-on-premise-ibm-sterling-order-manag
