Hi all,

We have MSC Actran running across two nodes, and it runs fine with firewalld disabled. We use the bundled mpiexec.hydra (export I_MPI_HYDRA_BOOTSTRAP=lsf) to integrate with LSF, so that processes are launched through blaunch instead of the default ssh.

I don't know the exact sequence, but when firewalld is enabled, blaunch starts hydra_bstrap_proxy on node 2 and nothing is started on node 1. A nios process also starts on node 2 and listens on two ephemeral/random TCP ports. res on node 2 establishes a connection to one of these, but res on node 1 is unable to get past the SYN of the three-way TCP handshake. An strace of the res process on node 1 shows:

18102 09:55:36 connect(3, {sa_family=AF_INET, sin_port=htons(46100), sin_addr=inet_addr("x.x.x.x")}, 16) = -1 EHOSTUNREACH (No route to host)

tcpdump on node 2 shows an ICMP "host prohibited" reject generated by firewalld. The res log on node 1 shows:

resRexecPjob: resPjobCallbackNIOS(46100) failed.

In our lsf.conf we have explicitly set:

LSF_NIOS_PORT_RANGE=47000-48000

But for whatever reason nios starts on a random ephemeral TCP port outside of this range. We configured LSF_NIOS_PORT_RANGE many months ago because we were experiencing firewall problems with bsub -K, and that continues to work normally.

Any ideas please why LSF_NIOS_PORT_RANGE is ignored?

Best Regards - Colin
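For anyone wanting to confirm the same symptom, here is a minimal Python sketch that checks whether a port seen in strace or the res log falls inside the configured LSF_NIOS_PORT_RANGE. The helper names (parse_port_range, port_in_range) are hypothetical and not part of LSF, and the parsing assumes a simple KEY=LOW-HIGH line with optional # comments, which may not cover every lsf.conf layout.

```python
import re


def parse_port_range(conf_text, key="LSF_NIOS_PORT_RANGE"):
    """Return (low, high) from a KEY=LOW-HIGH line, or None if not found.

    Assumption: '#' starts a comment and the value is a plain LOW-HIGH pair.
    """
    pattern = re.compile(rf"{re.escape(key)}\s*=\s*(\d+)\s*-\s*(\d+)")
    for line in conf_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        match = pattern.fullmatch(line)
        if match:
            return int(match.group(1)), int(match.group(2))
    return None


def port_in_range(port, port_range):
    """True if port lies within the inclusive (low, high) range."""
    low, high = port_range
    return low <= port <= high


# Values taken from the post above: the configured range and the port
# that res on node 1 tried to reach.
conf = "LSF_NIOS_PORT_RANGE=47000-48000\n"
configured = parse_port_range(conf)
observed_port = 46100

print(configured, port_in_range(observed_port, configured))
```

With the values from the post, the observed port 46100 is outside 47000-48000, which matches the firewalld reject: only the configured range would have been opened.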
You may review the following patches. The second one should also cover the first.
Hello Yi, the patch details appear to be an identical match to the problem we are seeing. We will test the install.
Many Thanks,