AIX


Strange ping behaviour - A can ping B, B cannot ping A

  • 1.  Strange ping behaviour - A can ping B, B cannot ping A

    Posted Tue May 07, 2013 03:18 PM

    Originally posted by: SamuraiMark


    One of my client LPARs cannot talk to my NFS server, so my users cannot log in. This happens every "once in a while".

    Situation: three LPARs, A (client), B (NFS) and C (NIM). Users cannot SSH into A because A cannot mount home directories from B. All three LPARs are in the same frame, and all three have IP addresses on the same VLAN.

    Ping tests:

    • A and B can both ping C.
    • B (NFS) can ping A.
    • A (client) cannot ping B.

    After "a while" the situation corrects itself and A can talk to B normally.

    Ping tests from B to A and C:

    # ping nim.empire.ca
    PING kgnnim01.empire.ca (10.10.2.24): 56 data bytes
    64 bytes from 10.10.2.24: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 10.10.2.24: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 10.10.2.24: icmp_seq=2 ttl=255 time=0 ms
    ^C
    --- kgnnim01.empire.ca ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0/0/0 ms
    # ping nfs.empire.ca
    PING kgnnfs01.empire.ca (10.10.2.25): 56 data bytes
    64 bytes from 10.10.2.25: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 10.10.2.25: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 10.10.2.25: icmp_seq=2 ttl=255 time=0 ms
    ^C
    --- kgnnfs01.empire.ca ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0/0/0 ms
    #

    Ping tests from A to B and C:

    # ping nim.empire.ca
    PING kgnnim01.empire.ca (10.10.2.24): 56 data bytes
    64 bytes from 10.10.2.24: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 10.10.2.24: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 10.10.2.24: icmp_seq=2 ttl=255 time=0 ms
    ^C
    --- kgnnim01.empire.ca ping statistics ---
    3 packets transmitted, 3 packets received, 0% packet loss
    round-trip min/avg/max = 0/0/0 ms
    # ping nfs.empire.ca
    PING kgnnfs01.empire.ca (10.10.2.25): 56 data bytes
    ^C
    --- kgnnfs01.empire.ca ping statistics ---
    17 packets transmitted, 0 packets received, 100% packet loss
    #

    In the time it took me to write this out, the situation corrected itself, and I did nothing to correct it:

    # ping nfs.empire.ca
    PING kgnnfs01.empire.ca (10.10.2.25): 56 data bytes
    64 bytes from 10.10.2.25: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 10.10.2.25: icmp_seq=1 ttl=255 time=0 ms
    ^C
    --- kgnnfs01.empire.ca ping statistics ---
    2 packets transmitted, 2 packets received, 0% packet loss
    round-trip min/avg/max = 0/0/0 ms
    #

    My only thought is that the NFS server is temporarily blocking the client via firewalling, but I have no evidence to back that up. Not sure where to go from here. Next time it happens I'll run tcpdump on the NFS box to see if it can see the incoming pings and whether it is responding. Any pointers are appreciated.

    - Mark
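
    The tcpdump check mentioned above could be set up roughly as follows. This is only a sketch: capture_filter is a hypothetical helper, and the client address and the interface name (en0) are placeholders that do not come from this thread. Capturing ARP alongside ICMP shows whether the echo requests arrive at all, whether replies go out, and which MAC addresses are involved.

```shell
#!/bin/sh
# Sketch only. CLIENT_IP and IFACE are placeholders, not values from this thread.

# Build a pcap filter that matches ICMP and ARP traffic to or from one host.
capture_filter() {
    printf '(icmp or arp) and host %s' "$1"
}

# Example use on the NFS server (run as root while the problem is occurring):
#   CLIENT_IP=<address of LPAR A>
#   IFACE=en0        # assumed interface name
#   tcpdump -n -e -i "$IFACE" "$(capture_filter "$CLIENT_IP")"
# -n skips name lookups; -e prints the link-level header, so the source and
# destination MAC addresses of each packet are visible.
```

    If requests show up but no replies leave, the problem is on the server side; if nothing shows up, the requests are being sent somewhere else.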



  • 2.  Re: Strange ping behaviour - A can ping B, B cannot ping A

    Posted Wed May 15, 2013 11:45 AM

    Originally posted by: GarlandJoseph


    Make sure you don't have a duplicate IP address (check the AIX error log). Also, since they are on the same LAN, check your ARP caches and compare the MAC addresses when you can't ping.

    So, say node A can't ping node B: check the MAC address that node A's ARP cache holds for node B (arp -an, or arp <node-b-hostname>) and record that value. Then run arp -d <node-b-hostname> to delete the entry, ping node B again, and look at the value again. It may still be the same. In that case, the next time you can ping node B from node A, look in node A's ARP cache again and compare the value seen during the failure with the one seen when the ping succeeds.

    This is just one thing to do.
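
    The ARP comparison described above could be scripted along these lines. It is a minimal sketch, not a tested AIX tool: extract_mac and compare_macs are hypothetical helper names, and the parsing assumes arp -an prints lines of the form "? (ip) at mac ...", which may differ on your system.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, and the "? (ip) at mac ..."
# line layout of `arp -an` is an assumption about the local arp command.

# Read arp -an style output on stdin; print the MAC recorded for IP $1.
extract_mac() {
    awk -v ip="($1)" '{
        for (i = 1; i < NF; i++)
            if ($i == ip && $(i + 1) == "at") { print $(i + 2); exit }
    }'
}

# Compare two recorded MAC values and report the result as one word.
compare_macs() {
    if [ -z "$1" ] || [ -z "$2" ]; then
        echo "incomplete"      # one of the snapshots had no entry
    elif [ "$1" = "$2" ]; then
        echo "same"
    else
        echo "changed"         # different MAC: suspect a duplicate IP
    fi
}

# Example use on node A (10.10.2.25 is the NFS address from the ping output):
#   bad=$(arp -an | extract_mac 10.10.2.25)     # while ping fails
#   ...later, once ping works again...
#   good=$(arp -an | extract_mac 10.10.2.25)
#   compare_macs "$bad" "$good"
```

    A result of "changed" would point toward a second machine answering for node B's address; "same" suggests looking elsewhere.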