This usually indicates that the LSF services (especially mbatchd, mbschd, and lim) on the primary management node are busy. This can happen for several reasons. Consider starting with the initial analysis steps listed below.
- Review the LSF log files for errors logged around the same time.
- Use the support tool (under LSF_TOP/10.1/util/support) to run checks such as network, name resolution, and file system.
- Enable the performance monitor (badmin perfmon start) and view its output (badmin perfmon view) to check the batch system load.
- Check the resource usage (for example, CPU and memory utilization) of the LSF services.
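Some of the checks above can be collected into a quick snapshot. The following is only a minimal sketch, not an official LSF tool. It assumes a Linux management host with the procps version of ps, that the LSF profile has been sourced so LSF_ENVDIR is set and badmin is on the PATH, that LSF_LOGDIR is defined in lsf.conf, and that the daemon logs follow the usual daemon.log.hostname naming. The function name lsf_triage is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper (not part of LSF): print a quick triage snapshot
# for the main LSF daemons on the current host.
lsf_triage() {
    # 1. CPU and memory usage of the daemons (Linux procps `ps` syntax).
    echo "== daemon resource usage =="
    ps -C mbatchd,mbschd,lim -o pid,pcpu,pmem,rss,etime,comm 2>/dev/null \
        || echo "no LSF daemons found on this host"

    # 2. Last lines of each daemon log. Assumption: LSF_LOGDIR is set in
    #    $LSF_ENVDIR/lsf.conf and logs are named <daemon>.log.<hostname>.
    echo "== recent log entries =="
    logdir=$(sed -n 's/^LSF_LOGDIR=//p' "${LSF_ENVDIR:-/nonexistent}/lsf.conf" 2>/dev/null)
    for f in "$logdir"/mbatchd.log.* "$logdir"/mbschd.log.* "$logdir"/lim.log.*; do
        if [ -f "$f" ]; then
            echo "-- $f"
            tail -n 20 "$f"
        fi
    done

    # 3. Batch system load from the performance monitor. This only shows
    #    data if sampling was already started with `badmin perfmon start`.
    echo "== performance monitor =="
    if command -v badmin >/dev/null 2>&1; then
        badmin perfmon view
    else
        echo "badmin not found; source the LSF profile first"
    fi
}
```

Run lsf_triage on the primary management node while the problem is occurring, and keep the output in case it needs to be attached to a support case.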
If the situation does not recover on its own quickly, create a support case to get assistance from Support.