AIX

Strange localhost ping problem

  • 1.  Strange localhost ping problem

    Posted Sat October 24, 2009 05:08 PM

    Originally posted by: SystemAdmin


    I see this behavior on only one server: the localhost ping response time varies widely.

    (/)> ping localhost
    PING loopback: (127.0.0.1): 56 data bytes
    64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=172 ms
    64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=4 ttl=255 time=100 ms
    64 bytes from 127.0.0.1: icmp_seq=5 ttl=255 time=1000 ms
    64 bytes from 127.0.0.1: icmp_seq=6 ttl=255 time=332 ms
    64 bytes from 127.0.0.1: icmp_seq=7 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=8 ttl=255 time=11 ms
    64 bytes from 127.0.0.1: icmp_seq=9 ttl=255 time=1 ms
    64 bytes from 127.0.0.1: icmp_seq=10 ttl=255 time=1 ms
    64 bytes from 127.0.0.1: icmp_seq=11 ttl=255 time=58 ms
    64 bytes from 127.0.0.1: icmp_seq=12 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=13 ttl=255 time=884 ms
    64 bytes from 127.0.0.1: icmp_seq=14 ttl=255 time=28 ms
    64 bytes from 127.0.0.1: icmp_seq=15 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=16 ttl=255 time=555 ms

    ----loopback PING Statistics----
    17 packets transmitted, 17 packets received, 0% packet loss
    round-trip min/avg/max = 0/184/1000 ms

    Every other server shows a 0 ms response:

    (/)> ping localhost
    PING loopback: (127.0.0.1): 56 data bytes
    64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=4 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=5 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=6 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=7 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=8 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=9 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=10 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=11 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=12 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=13 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=14 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=15 ttl=255 time=0 ms

    ----loopback PING Statistics----
    16 packets transmitted, 16 packets received, 0% packet loss
    round-trip min/avg/max = 0/0/0 ms

    On the first server, the localhost response time varies only when there is very low activity on the server and on lo0 (it is using about 0.2 of its 8 CPUs). When the system is busy, with SAP accessing the local DB via localhost, every localhost ping returns in 0 ms.

    It could be that this is not an issue, but I see it only on this one server, and this server has highly variable DB access times during periods of low usage. That is, poor DB response correlates with bad localhost ping times and with low usage.

    I don't know whether there is cause and effect between the two issues, but I'd like to eliminate the odd localhost ping behavior and see if that helps the DB response issue.
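
    To compare the two servers, or busy and idle periods, with numbers rather than by eye, captured ping output can be summarized. The following is a minimal sketch, assuming a Python 3 interpreter is available (on the host or wherever the output is saved). The function name summarize_ping and the 10 ms "slow" threshold are illustrative choices, not anything from AIX.

```python
import re
import statistics

# Matches the reply lines shown in this thread, e.g.
# "64 bytes from 127.0.0.1: icmp_seq=4 ttl=255 time=100 ms"
REPLY_RE = re.compile(r"icmp_seq=(\d+)\s+ttl=\d+\s+time=(\d+(?:\.\d+)?)\s*ms")


def summarize_ping(output, slow_ms=10.0):
    """Summarize round-trip times found in captured ping output text.

    slow_ms is an arbitrary illustrative threshold, not an AIX default.
    """
    samples = [(int(seq), float(ms)) for seq, ms in REPLY_RE.findall(output)]
    if not samples:
        raise ValueError("no ping reply lines found")
    times = [ms for _, ms in samples]
    return {
        "count": len(times),
        "min": min(times),
        "avg": sum(times) / len(times),
        "max": max(times),
        "median": statistics.median(times),
        # (icmp_seq, time) pairs at or above the threshold
        "slow": [(seq, ms) for seq, ms in samples if ms >= slow_ms],
    }


if __name__ == "__main__":
    sample = """\
64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=172 ms
64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0 ms
64 bytes from 127.0.0.1: icmp_seq=4 ttl=255 time=100 ms
64 bytes from 127.0.0.1: icmp_seq=5 ttl=255 time=1000 ms
"""
    print(summarize_ping(sample))
```

    Running the same summary on output captured during busy and idle periods would show whether the slow replies really cluster in the low-usage windows.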

    Any ideas?


  • 2.  Re: Strange localhost ping problem

    Posted Sat October 24, 2009 05:18 PM

    Originally posted by: SystemAdmin


    The system is AIX 5.3, 5300-08-03-0831.


  • 3.  Re: Strange localhost ping problem

    Posted Mon October 26, 2009 05:34 AM

    Originally posted by: tony.evans


    Does it do this with SAP and the DB down?


  • 4.  Re: Strange localhost ping problem

    Posted Thu December 03, 2009 04:33 AM

    Originally posted by: purgatory


    Was there ever a solution to this problem? One of my servers just ran into this same localhost ping issue, and our only solution was to reboot, which seems to have solved it for the time being. I would like to know if there is a better solution or if there is a risk of this happening again.

    5300-10-01-0921
    I checked 'netstat -rn' before and after the reboot and the outputs are identical, so if it's a route issue it doesn't show up with the normal command.
    The server uses EtherChannel with 2 primary adapters and a backup adapter; I tried failing back and forth with zero luck.
    The server is part of an Oracle RAC cluster, so I'm curious whether that has anything to do with the problem.

    64 bytes from 127.0.0.1: icmp_seq=196 ttl=255 time=47 ms
    64 bytes from 127.0.0.1: icmp_seq=197 ttl=255 time=160 ms
    64 bytes from 127.0.0.1: icmp_seq=198 ttl=255 time=161 ms
    64 bytes from 127.0.0.1: icmp_seq=199 ttl=255 time=163 ms
    64 bytes from 127.0.0.1: icmp_seq=200 ttl=255 time=799 ms
    64 bytes from 127.0.0.1: icmp_seq=201 ttl=255 time=1000 ms
    64 bytes from 127.0.0.1: icmp_seq=202 ttl=255 time=551 ms
    64 bytes from 127.0.0.1: icmp_seq=203 ttl=255 time=1000 ms
    64 bytes from 127.0.0.1: icmp_seq=204 ttl=255 time=688 ms
    64 bytes from 127.0.0.1: icmp_seq=205 ttl=255 time=385 ms
    64 bytes from 127.0.0.1: icmp_seq=206 ttl=255 time=65 ms
    64 bytes from 127.0.0.1: icmp_seq=207 ttl=255 time=301 ms
    64 bytes from 127.0.0.1: icmp_seq=208 ttl=255 time=192 ms


  • 5.  Re: Strange localhost ping problem

    Posted Thu December 03, 2009 07:06 PM

    Originally posted by: dukessd


    Sounds a bit like a DNS problem.

    Make sure your hosts file has local lookup addresses:

    127.0.0.1 loopback localhost # loopback (lo0) name/address

    and put "hosts=local,bind" or "hosts=local4,bind4" in your /etc/netsvc.conf file. While you are there, check for other statements that may be a problem.
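
    The two checks above can be sketched as a small script. This is only an illustration, assuming Python 3 and copies of the file contents passed in as text. The function names are hypothetical, and the script looks only at the first uncommented hosts= entry without trying to model every netsvc.conf option.

```python
def loopback_names(hosts_text):
    """Return the names that /etc/hosts-style text maps to 127.0.0.1."""
    names = []
    for line in hosts_text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments, split on whitespace
        if len(fields) >= 2 and fields[0] == "127.0.0.1":
            names.extend(fields[1:])
    return names


def hosts_order(netsvc_text):
    """Return the resolver order from the first uncommented hosts= entry, or None."""
    for line in netsvc_text.splitlines():
        entry = "".join(line.split("#", 1)[0].split())  # strip comments and whitespace
        if entry.lower().startswith("hosts="):
            return [item for item in entry[len("hosts="):].split(",") if item]
    return None


def check(hosts_text, netsvc_text):
    """Return a list of human-readable problems; an empty list means both checks pass."""
    problems = []
    names = loopback_names(hosts_text)
    for wanted in ("loopback", "localhost"):
        if wanted not in names:
            problems.append(f"127.0.0.1 is not mapped to '{wanted}' in the hosts file")
    order = hosts_order(netsvc_text)
    if not order:
        problems.append("no uncommented hosts= entry in netsvc.conf")
    elif not order[0].startswith("local"):
        problems.append(f"local lookup is not first in hosts= order: {order}")
    return problems


if __name__ == "__main__":
    hosts = "127.0.0.1 loopback localhost # loopback (lo0) name/address\n"
    netsvc = "# comment\nhosts=local,bind4\n"
    print(check(hosts, netsvc) or "both checks pass")
```

    On a real system the two strings would come from reading /etc/hosts and /etc/netsvc.conf.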


  • 6.  Re: Strange localhost ping problem

    Posted Wed December 09, 2009 04:29 PM

    Originally posted by: purgatory


    I had originally checked /etc/hosts and that looks good. /etc/netsvc.conf looks good as well, with the only uncommented line being "hosts=local,bind4".
    The response I received from IBM was less than helpful, since they didn't get to see the server while the problem was occurring. If I can get a better answer from IBM I'll post it.
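
    With both files looking correct, one way to rule name resolution in or out is to time the lookup on its own, separately from the ICMP round trip. The following is a rough sketch, assuming Python 3 is available on the host. The helper name time_lookups is made up for illustration, and the resolver parameter exists only so that a different lookup function can be substituted.

```python
import socket
import time


def time_lookups(name="localhost", attempts=5, resolver=socket.gethostbyname):
    """Time repeated name lookups; returns a list of (address, milliseconds)."""
    results = []
    for _ in range(attempts):
        start = time.perf_counter()
        address = resolver(name)
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        results.append((address, elapsed_ms))
    return results
```

    Calling time_lookups("localhost") while the slow pings are occurring, and again when they are not, would show whether lookup time changes along with them. If the lookups stay fast while the pings are slow, the DNS theory looks less likely.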

    Let me see if I can add a few details that may jog people's memory or thoughts on this issue.

    Pings to loopback or localhost had return times between 10 ms and 1000 ms, and application teams noticed extreme slowdown on locally run jobs.

    The primary interface is an EtherChannel that includes en0+en1 as primary and en2 as backup. All pings to external addresses worked beautifully; only localhost or loopback pings choked.

    The EtherChannel uses round_robin with the default hash; all other options are default.


  • 7.  Re: Strange localhost ping problem

    Posted Mon December 14, 2009 05:49 PM

    Originally posted by: Kruso


    Hi,

    I had the same problem, and IBM recommended setting lo_perf to 0, despite what the performance tuning documents say.

    Kruso.


  • 8.  Re: Strange localhost ping problem

    Posted Fri May 21, 2010 06:27 AM

    Originally posted by: appr_rules


    Hi gentlemen,
    I have the same problem on an AIX LPAR at version 5.3 TL06 SP06, on POWER6 hardware.
    Did you find a solution to this important problem?

    Thanks in advance,
    Best Regards,
    Olivier.


  • 9.  Re: Strange localhost ping problem

    Posted Thu June 03, 2010 06:49 AM

    Originally posted by: kicky


    Hello
    Same problem on a power5 system running AIX 5300-08-03-0831.
    Ping response is slow on ALL local interfaces that are routed through lo0, but is OK for all pings to external systems:

    ping $remote
    ..
    64 bytes from 192.168.124.5: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 192.168.124.5: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 192.168.124.5: icmp_seq=2 ttl=255 time=0 ms
    64 bytes from 192.168.124.5: icmp_seq=3 ttl=255 time=0 ms

    ping $local
    ..
    64 bytes from 192.168.124.25: icmp_seq=0 ttl=255 time=333 ms
    64 bytes from 192.168.124.25: icmp_seq=1 ttl=255 time=12 ms
    64 bytes from 192.168.124.25: icmp_seq=2 ttl=255 time=0 ms
    64 bytes from 192.168.124.25: icmp_seq=3 ttl=255 time=49 ms
    64 bytes from 192.168.124.25: icmp_seq=4 ttl=255 time=70 ms

    One direct consequence: all NFS cross-mounts are terribly slow.

    Anyone got to the root cause yet? Or a solution other than reboot?


  • 10.  Re: Strange localhost ping problem

    Posted Sun June 06, 2010 06:15 PM

    Originally posted by: cd3lgado


    Hi

    Looks like a software problem to me. Have you applied the latest Technology Level available?


  • 11.  Re: Strange localhost ping problem

    Posted Mon June 07, 2010 07:38 AM

    Originally posted by: kicky


    Hi
    I am also convinced it is a software problem. Unfortunately we're not at liberty to install updates without testing. On the other hand, the problem is very rare for us and we can't recreate it at will, so even if we apply fixes, we can't really test them.
    I am looking for an "in situ" solution, if anyone has one.


  • 12.  Re: Strange localhost ping problem

    Posted Mon May 30, 2016 09:28 AM

    Originally posted by: patricio.rodas


    Hi. The issue is fixed in my case.

    Environment:

    Server running AIX 5300-12-09-1341

    Two-node Oracle 9i (RAC) database

    4 network interfaces (1 users, 2 backups, 3 InterConnect, 4 additional for Data Guard)

    Symptom:

    Node 1 experiences slowness in some bash processes that use the database.

    Solution:

    Check /etc/hosts, e.g.

    127.0.0.1               loopback localhost      # loopback (lo0) name/address

    172.17.1.71   servidor1

    and /etc/netsvc.conf (the order is local resolution first, then DNS)

    hosts=local,bind4

    My environment had one peculiarity: the server had 4 network interfaces, and pinging any of them gave the same 100 ms response time. That ruled out a problem with the network equipment, and also a problem with the server's adapters (they were different cards, and one interface is even 10 Gb).

    The interface "4 additional for Data Guard" had been used temporarily for a project, so it was configured in the database listener. The interface was later removed from the server, but it had not been removed from the listener configuration.

    Definitive solution:

    The reference to the interface "4 additional for Data Guard" was removed from the listener configuration (done by the DBA).

    The network interface "1 users" (172.17.1.71) was removed with rmdev -dl.

    The network interface was reconfigured with cfgmgr.

    So far we have gone 7 days with a 0 ms response time:

    64 bytes from 127.0.0.1: icmp_seq=0 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=1 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=2 ttl=255 time=0 ms
    64 bytes from 127.0.0.1: icmp_seq=3 ttl=255 time=0 ms