Originally posted by: SamuraiMark
One of my client LPARs intermittently cannot talk to my NFS server, and while that lasts my users cannot log in. This happens every "once in a while".
Situation: three LPARs, A (client), B (NFS) and C (NIM). Users cannot SSH into A because A cannot mount home directories from B. All three LPARs are in the same frame, and all three have IP addresses on the same VLAN.
Ping tests:
- A and B can both ping C.
- B (NFS) can ping A.
- A (client) cannot ping B.
After "a while" the situation corrects itself and A can talk to B normally.
Ping tests from B to A and C:
# ping nim.empire.ca
PING kgnnim01.empire.ca (10.10.2.24): 56 data bytes
64 bytes from 10.10.2.24: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.10.2.24: icmp_seq=1 ttl=255 time=0 ms
64 bytes from 10.10.2.24: icmp_seq=2 ttl=255 time=0 ms
^C
--- kgnnim01.empire.ca ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
# ping nfs.empire.ca
PING kgnnfs01.empire.ca (10.10.2.25): 56 data bytes
64 bytes from 10.10.2.25: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.10.2.25: icmp_seq=1 ttl=255 time=0 ms
64 bytes from 10.10.2.25: icmp_seq=2 ttl=255 time=0 ms
^C
--- kgnnfs01.empire.ca ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
#
Ping tests from A to B and C:
# ping nim.empire.ca
PING kgnnim01.empire.ca (10.10.2.24): 56 data bytes
64 bytes from 10.10.2.24: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.10.2.24: icmp_seq=1 ttl=255 time=0 ms
64 bytes from 10.10.2.24: icmp_seq=2 ttl=255 time=0 ms
^C
--- kgnnim01.empire.ca ping statistics ---
3 packets transmitted, 3 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
# ping nfs.empire.ca
PING kgnnfs01.empire.ca (10.10.2.25): 56 data bytes
^C
--- kgnnfs01.empire.ca ping statistics ---
17 packets transmitted, 0 packets received, 100% packet loss
#
In the time it took me to write this out, the situation corrected itself, and I did nothing to correct it:
# ping nfs.empire.ca
PING kgnnfs01.empire.ca (10.10.2.25): 56 data bytes
64 bytes from 10.10.2.25: icmp_seq=0 ttl=255 time=0 ms
64 bytes from 10.10.2.25: icmp_seq=1 ttl=255 time=0 ms
^C
--- kgnnfs01.empire.ca ping statistics ---
2 packets transmitted, 2 packets received, 0% packet loss
round-trip min/avg/max = 0/0/0 ms
#
My only thought is that the NFS server is temporarily blocking the client via firewalling, but I have no evidence to back that up. Not sure where to go from here. Next time it happens I'll run tcpdump on the NFS box to see if it can see the incoming pings and whether it is responding. Any pointers are appreciated.
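To have a timestamp ready for the tcpdump check, one option is a small watcher left running on A that records when pings to the NFS server start failing. The following is only a minimal sketch under stated assumptions: the function names `loss_pct` and `watch_host`, the log path, and the polling interval are hypothetical, and the parser assumes a ping summary line in the same format as the output shown above.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative and not from the original post.

# Read ping output on stdin and print the packet-loss percentage.
# Assumes a summary line such as:
#   17 packets transmitted, 0 packets received, 100% packet loss
loss_pct() {
    sed -n 's/.* \([0-9][0-9]*\)% packet loss.*/\1/p'
}

# watch_host HOST LOGFILE
# Send three pings to HOST and append a timestamped line to LOGFILE
# only when every packet was lost.
watch_host() {
    host=$1
    log=$2
    loss=$(ping -c 3 "$host" 2>/dev/null | loss_pct)
    if [ "$loss" = "100" ]; then
        echo "$(date '+%Y-%m-%d %H:%M:%S') $host unreachable" >> "$log"
    fi
}
```

It could be driven from a loop on A, for example `while true; do watch_host nfs.empire.ca /tmp/nfs_ping.log; sleep 30; done` (log path and interval are assumptions). Once the log shows an outage in progress, something like `tcpdump -i en0 icmp` on B would show whether the echo requests from A arrive and whether B replies; `en0` is an assumed interface name and should be replaced with the one that carries the VLAN.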
- Mark