AIX

Strange NIC bandwidth issue

  • 1.  Strange NIC bandwidth issue

    Posted Wed October 26, 2011 12:12 AM

    Originally posted by: JohnMiller31


    I have a very old 7025-F50 running 4.3.3 that we're finally retiring. In trying to move about 54G of data off it I quickly realized it only had a 10/100 card in it, so I installed a 2975 10/100/1000 NIC to do the transfer in less than 3 hours. Here's where it gets weird: the switch the card is plugged into shows it's running 1000Mb full duplex, and the NIC in AIX shows it's running at that speed (entstat). Everything I have tells me it's running at gigabit speeds, however the actual transfer is still coming out at about the 100Mb speed. Normally I run transfers with SCP, so I thought maybe that was it, but when I tried plain old FTP I still got the same speed.

    I am completely out of ideas, does anyone have any?

    Thanks,

    John
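
As a rough sanity check on the figures in this post, the sketch below estimates how long 54G would take at different line rates. It assumes "54G" means 54e9 bytes and ignores protocol and disk overhead, so the results are lower bounds rather than predictions; the 40 Mbit/s entry corresponds to the roughly 5 megabytes/sec reported later in the thread. The function name is illustrative only.

```python
def transfer_hours(size_bytes: float, rate_bits_per_s: float) -> float:
    """Idealized transfer time in hours, ignoring all overhead."""
    return size_bytes * 8 / rate_bits_per_s / 3600


SIZE_BYTES = 54e9  # assumption: "54G" taken as 54 * 10**9 bytes

# Line rates in bits per second.
rates = {
    "40 Mbit/s (about 5 MB/s)": 40e6,
    "100 Mbit/s": 100e6,
    "1000 Mbit/s": 1000e6,
}

hours = {label: transfer_hours(SIZE_BYTES, rate) for label, rate in rates.items()}

for label, h in hours.items():
    print(f"{label}: {h:.2f} h")
```

Under these assumptions the 3-hour estimate matches a sustained rate of about 40 Mbit/s, while a saturated gigabit link would need only a few minutes.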


  • 2.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 07:05 AM

    Originally posted by: tony.evans


    What's the CPU load during the transfer?
    What's the disk I/O like?

    Are you sure the bottleneck is the NIC?


  • 3.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 12:32 PM

    Originally posted by: JohnMiller31


    Ok so I'll try to answer all the questions in one post.

    I'm transferring to a P5-520 with a gigabit NIC, but I've also tried it to a Linux machine with a gigabit NIC and even a Linux machine on an ESXi host, and all result in a transfer speed of about 5 megabytes/sec, or about 40 megabits/sec. I've measured and verified it in the following ways. The first transfer attempt was with SCP, which reports speed, but of course I don't like trusting that, so I looked at the bandwidth on that switchport and it confirmed the speed (nothing else was going over that switchport or NIC). I also tried to eliminate SCP and just use plain old FTP; the result was approximately the same speed. So I've eliminated the destination machine and the transfer program.

    I will have to check the CPU load later but as I recall it was not excessive and it's not IO bound on either end, similar disks on both machines.

    I've attached a copy of entstat on the adapter below, notice the data speed is 2000 which is 1000Mbit Full Duplex.

    ETHERNET STATISTICS (en2) :
    Device Type: IBM 10/100/1000 Base-T Ethernet PCI Adapter (14100401)
    Hardware Address: 00:04:ac:7c:dc:2d
    Elapsed Time: 0 days 14 hours 12 minutes 41 seconds

    Transmit Statistics:                          Receive Statistics:
    Packets: 43785930                             Packets: 15915158
    Bytes: 62026122516                            Bytes: 1004529846
    Interrupts: 675033                            Interrupts: 15599639
    Transmit Errors: 0                            Receive Errors: 0
    Packets Dropped: 0                            Packets Dropped: 0
                                                  Bad Packets: 0
    Max Packets on S/W Transmit Queue: 69
    S/W Transmit Queue Overflow: 0
    Current S/W+H/W Transmit Queue Length: 2

    Elapsed Time: 0 days 14 hours 12 minutes 40 seconds
    Broadcast Packets: 459                        Broadcast Packets: 273536
    Multicast Packets: 2                          Multicast Packets: 0
    No Carrier Sense: 0                           CRC Errors: 0
    DMA Underrun: 0                               DMA Overrun: 0
    Lost CTS Errors: 0                            Alignment Errors: 0
    Max Collision Errors: 0                       No Resource Errors: 0
    Late Collision Errors: 0                      Receive Collision Errors: 0
    Deferred: 0                                   Packet Too Short Errors: 0
    SQE Test: 0                                   Packet Too Long Errors: 0
    Timeout Errors: 0                             Packets Discarded by Adapter: 0
    Single Collision Count: 0                     Receiver Start Count: 0
    Multiple Collision Count: 0
    Current HW Transmit Queue Length: 2

    General Statistics:

    No mbuf Errors: 0
    Adapter Reset Count: 0
    Adapter Data Rate: 2000
    Driver Flags: Up Broadcast Running
    Simplex AlternateAddress 64BitSupport
    ChecksumTCP ChecksumOffload PrivateSegment
    DataRateSet


  • 4.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 01:47 PM

    Originally posted by: Holgervk


    First, your average receive packet size is quite low, but that is not related to the problem.

    To isolate the problem you have to avoid using ssh (it adds complexity) and even disk access.
    Start an ftp session and type
    put "| dd if=/dev/zero bs=32k count=100000" /dev/null

    You can play around a bit with the block size and the count.

    The ftp client will tell you the bandwidth. Post the result here.
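
The remark about the average receive packet size can be checked from the entstat counters posted earlier. The sketch below simply divides bytes by packets for each direction; the variable names are illustrative, and reading the small receive average as mostly acknowledgements for outbound data is an assumption, not something the thread confirms.

```python
# Counters copied from the entstat output for en2 posted in the thread.
tx_bytes, tx_packets = 62026122516, 43785930
rx_bytes, rx_packets = 1004529846, 15915158

# Average payload per packet in each direction.
avg_tx = tx_bytes / tx_packets
avg_rx = rx_bytes / rx_packets

print(f"average transmit packet: {avg_tx:.1f} bytes")
print(f"average receive packet:  {avg_rx:.1f} bytes")
```

The transmit average sits close to the 1500-byte MTU, which is what a bulk sender would be expected to produce, while the receive average is only a few tens of bytes.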


  • 5.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 04:09 PM

    Originally posted by: JohnMiller31


    First, my compliments on a diabolically brilliant way to test bandwidth.

    Here are the results. I ran it twice, once from each machine to the other. Both of these results were confirmed on our switch monitoring.

    From F50 to 520:

    3276800000 bytes sent in 338.8 seconds (9444 Kbytes/s)

    From 520 to F50:

    3276800000 bytes sent in 183.8 seconds (1.741e+04 Kbytes/s)

    To separate the F50 and 520 performance from one another I ran the same tests from another machine (Linux on ESXi). The 520 performs about as I would expect. The F50 actually performs better than I would have thought, although it's still at roughly half the throughput of the 520.

    From Linux to 520:

    3276800000 bytes (3.3 GB) copied, 39.3812 s, 83.2 MB/s
    226 Transfer complete.
    3276800000 bytes sent in 39.4 secs (83111.12 Kbytes/sec)

    From Linux to F50:

    3276800000 bytes (3.3 GB) copied, 83.281 s, 39.3 MB/s
    226 Transfer complete.
    3276800000 bytes sent in 83.3 secs (39346.21 Kbytes/sec)
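
To compare these four runs against the nominal 1000 Mbit/s link speed, the byte counts and durations reported by the ftp client can be converted to megabits per second. This is a minimal sketch using only the figures posted above; the function and dictionary names are illustrative.

```python
def mbit_per_s(total_bytes: int, seconds: float) -> float:
    """Convert a byte count and a duration into megabits per second."""
    return total_bytes * 8 / seconds / 1e6


# (bytes sent, seconds) as reported by the ftp client in each test.
results = {
    "F50 -> 520": (3276800000, 338.8),
    "520 -> F50": (3276800000, 183.8),
    "Linux -> 520": (3276800000, 39.4),
    "Linux -> F50": (3276800000, 83.3),
}

rates = {name: mbit_per_s(b, s) for name, (b, s) in results.items()}

for name, rate in rates.items():
    print(f"{name}: {rate:.0f} Mbit/s")
```

Expressed this way, the F50 sends at well under 100 Mbit/s but can receive at several hundred, which suggests the limit is on the F50's sending side rather than on the link itself.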


  • 6.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 04:10 PM

    Originally posted by: JohnMiller31


    Sorry should have said "test throughput" instead of "test bandwidth"


  • 7.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 04:25 PM

    Originally posted by: Holgervk


    OK, now we know that at the Ethernet level everything is fine,
    so let's move up to the IP level.

    First, do an ftp to localhost and test the bandwidth.

    then post the output of
    ifconfig -a
    no -a
    lsattr -El enxx #substitute enxx with the gigabit interface

    The F50 is 4.3.3, yes?
    What level is the 520?


  • 8.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 05:40 PM

    Originally posted by: JohnMiller31


    Yes, the F50 is 4.3.3; the 520 is 5.3.0.0.

    The ftp test with localhost is:

    3276800000 bytes sent in 272.2 seconds (1.176e+04 Kbytes/s)

    I also ran it running through the IP of the adapter and got:

    3276800000 bytes sent in 350.5 seconds (9130 Kbytes/s)

    # ifconfig -a
    en0: flags=e080863<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT>
    inet 10.1.254.2 netmask 0xffffff00 broadcast 10.1.254.255
    en2: flags=7e080863,10<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,CHECKSUM_OFFLOAD,CHECKSUM_SUPPORT,PSEG>
    inet 10.1.1.39 netmask 0xffffff00 broadcast 10.1.1.255
    tcp_sendspace 131072 tcp_recvspace 65536
    lo0: flags=e08084b<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT>
    inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
    inet6 ::1/0

    # no -a
    extendednetstats = 0
    thewall = 655308
    sockthresh = 85
    sb_max = 1048576
    somaxconn = 1024
    clean_partial_conns = 0
    net_malloc_police = 0
    rto_low = 1
    rto_high = 64
    rto_limit = 7
    rto_length = 13
    inet_stack_size = 16
    arptab_bsiz = 7
    arptab_nb = 25
    tcp_ndebug = 100
    ifsize = 8
    arpqsize = 1
    ndpqsize = 50
    route_expire = 1
    send_file_duration = 300
    fasttimo = 200
    routerevalidate = 0
    nbc_limit = 393164
    nbc_max_cache = 131072
    nbc_min_cache = 1
    nbc_pseg = 0
    nbc_pseg_limit = 655308
    strmsgsz = 0
    strctlsz = 1024
    nstrpush = 8
    strthresh = 85
    psetimers = 20
    psebufcalls = 20
    strturncnt = 15
    pseintrstack = 12288
    lowthresh = 90
    medthresh = 95
    psecache = 1
    subnetsarelocal = 1
    maxttl = 255
    ipfragttl = 60
    ipsendredirects = 1
    ipforwarding = 0
    udp_ttl = 30
    tcp_ttl = 60
    arpt_killc = 20
    tcp_sendspace = 16384
    tcp_recvspace = 16384
    udp_sendspace = 9216
    udp_recvspace = 41920
    rfc1122addrchk = 0
    nonlocsrcroute = 0
    tcp_keepintvl = 150
    tcp_keepidle = 14400
    bcastping = 0
    udpcksum = 1
    tcp_mssdflt = 512
    icmpaddressmask = 0
    tcp_keepinit = 150
    ie5_old_multicast_mapping = 0
    rfc1323 = 0
    pmtu_default_age = 10
    pmtu_rediscover_interval = 30
    udp_pmtu_discover = 1
    tcp_pmtu_discover = 1
    ipqmaxlen = 100
    directed_broadcast = 1
    ipignoreredirects = 0
    ipsrcroutesend = 1
    ipsrcrouterecv = 0
    ipsrcrouteforward = 1
    ip6srcrouteforward = 1
    ip6_defttl = 64
    ndpt_keep = 120
    ndpt_reachable = 30
    ndpt_retrans = 1
    ndpt_probe = 5
    ndpt_down = 3
    ndp_umaxtries = 3
    ndp_mmaxtries = 3
    ip6_prune = 2
    ip6forwarding = 0
    multi_homed = 1
    main_if6 = 0
    main_site6 = 0
    site6_index = 0
    maxnip6q = 20
    llsleep_timeout = 3
    tcp_timewait = 1
    tcp_ephemeral_low = 32768
    tcp_ephemeral_high = 65535
    udp_ephemeral_low = 32768
    udp_ephemeral_high = 65535
    delayack = 0
    delayackports = {}
    sack = 0
    use_isno = 1
    tcp_newreno = 0
    tcp_nagle_limit = 65535

    # lsattr -El en2
    mtu 1500 Maximum IP Packet Size for This Device True
    remmtu 576 Maximum IP Packet Size for REMOTE Networks True
    netaddr 10.1.1.39 Internet Address True
    state up Current Interface Status True
    arp on Address Resolution Protocol (ARP) True
    netmask 255.255.255.0 Subnet Mask True
    security none Security Level True
    authority Authorized Users True
    broadcast Broadcast Address True
    netaddr6 N/A True
    alias6 N/A True
    prefixlen N/A True
    alias4 N/A True
    rfc1323 N/A True
    tcp_nodelay N/A True
    tcp_sendspace N/A True
    tcp_recvspace N/A True
    tcp_mssdflt N/A True


  • 9.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 05:57 PM

    Originally posted by: Holgervk


    >I also ran it running through the IP of the adapter and got:
    I think that effectively runs through localhost, too. Run
    route -n get IP_of_the_adapter
    to see.
    Actually, you should decrease the count of the dd command by about 80%. There is no need to run 200- or 300-second tests; 30 seconds is enough.

    To analyze it further, let's try rfc1323:

    no -o rfc1323=1
    Then quit any running ftp session and do the transfer again.
    If you test transferring TO the F50, you have to restart ftpd.
    If it runs through inetd, do
    stopsrc -s inetd
    startsrc -s inetd
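
The reasoning behind trying rfc1323 can be sketched with the usual window/RTT bound: a single TCP connection can have at most one window of data in flight per round trip, and without RFC 1323 window scaling the advertised window cannot exceed 64 KB. The round-trip time below is an assumed value for a local switch, not something measured in this thread; the window sizes are the tcp_sendspace and tcp_recvspace values from the no -a and ifconfig output posted earlier. The function name is illustrative.

```python
def window_limited_mbit_per_s(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound on single-connection TCP throughput: one window per RTT."""
    return window_bytes * 8 / rtt_seconds / 1e6


# Hypothetical LAN round-trip time of 0.5 ms (an assumption, not a measurement).
ASSUMED_RTT = 0.0005

windows = {
    "no -a default (16384)": 16384,
    "en2 tcp_recvspace (65536)": 65536,
    "en2 tcp_sendspace (131072)": 131072,
}

bounds = {
    label: window_limited_mbit_per_s(size, ASSUMED_RTT)
    for label, size in windows.items()
}

for label, bound in bounds.items():
    print(f"{label}: at most {bound:.0f} Mbit/s")
```

With an RTT in this range, only the 16384-byte system default would cap throughput below gigabit speed, so if the interface-specific values on en2 are in effect (use_isno = 1), the window is unlikely to be the limiting factor.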


  • 10.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 06:21 PM

    Originally posted by: JohnMiller31


    Yeah, I figured it should run through the loopback, but I think it was good to test. After enabling rfc1323 there was no real change:

    3276800000 bytes sent in 330.7 seconds (9677 Kbytes/s)


  • 11.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 06:30 PM

    Originally posted by: Holgervk


    for future tests, please use
    put "| dd if=/dev/zero bs=64k count=10000" /dev/null
    instead of
    put "| dd if=/dev/zero bs=32k count=100000" /dev/null

    test the speed to loopback and to the p520 and post the results

    you did not mention what connection you tested

    and run a
    vmstat 5
    in another shell and post the output


  • 12.  Re: Strange NIC bandwidth issue

    Posted Fri October 28, 2011 12:27 PM

    Originally posted by: JohnMiller31


    Ok sorry took a day away from it.

    Using loopback:

    655360000 bytes sent in 58.78 seconds (1.089e+04 Kbytes/s)

    kthr  memory     page                faults          cpu
    ----- ---------- ------------------- --------------- -----------
     r  b   avm  fre re pi po  fr  sr cy   in    sy   cs us sy id wa
     1  1 55741  432  0  0  0  39  96  0  471 13775  123  7  3 89  2
     1  1 55809  127  0  0  0  55  57  0  446 69047   84 28  6 66  0
     2  1 55809  129  0  0  0 100 141  0  530 70230 1903 59 32  8  1
     5  1 56433  274  0  0  0 267 499  0  474 71998 1793 58 34  8  0
     3  1 55812  527  0  0  0   0   0  0  476 72809 1912 61 30  9  0
     2  1 55812  128  0  0  0  21  22  0  442 71760 1944 62 29 10  0
     2  1 55812  129  0  0  0 100 112  0  441 71319 1931 60 28 12  0
     3  1 55812  123  0  0  0  99 101  0  438 73663 1912 60 31  9  0
     3  1 55812  120  0  0  0 100 102  0  440 71268 1925 61 30  9  0
     2  1 55812  124  0  0  0 100 112  0  442 71222 1978 60 32  8  0
     2  1 55813  128  0  0  0 102 104  0  442 71470 1955 59 30 11  0
     2  1 55813  129  0  0  0 100 116  0  440 71152 1953 61 28 11  0
     2  1 55814  128  0  0  0 100 117  0  442 71273 1929 60 31  9  0
     2  1 55747  127  0  0  0  84  99  0  443 70477 1454 52 24 24  0

    Going to 520:

    655360000 bytes sent in 66.98 seconds (9555 Kbytes/s)

    kthr  memory     page                faults          cpu
    ----- ---------- ------------------- --------------- -----------
     r  b   avm  fre re pi po  fr  sr cy   in    sy   cs us sy id wa
     1  1 55512  130  0  0  0  39  96  0  471 13803  123  7  3 89  2
     1  1 55512  122  0  0  0  99 123  0  438 67765   73 28  6 66  0
     1  1 55623  120  0  0  0 122 195  0  940 66381  340 34 14 52  0
     2  1 55623  130  0  0  0  93 160  0 1901 63358  810 45 27 27  0
     2  1 55623  131  0  0  0  92 176  0 1928 63397  812 46 28 27  0
     2  1 55623  129  0  0  0  90 172  0 1890 63526  823 46 28 26  0
     2  1 55623  120  0  0  0  91 170  0 1893 63286  817 45 28 26  0
     3  1 55623  130  0  0  0  93 138  0 1905 63479  814 45 28 27  0
     2  1 55623  121  0  0  0  89 123  0 1906 63536  813 45 28 26  0
     2  1 55631  122  0  0  0  93 142  0 1860 63225  808 43 32 23  1
     2  1 55631  159  0  0  0  98 123  0 1840 63302  823 47 27 26  0
     2  1 55631  132  0  0  0  85 123  0 1866 62908  827 47 28 25  0
     2  1 55631  129  0  0  0  89 119  0 1870 63063  825 46 28 26  0
     3  1 55631  121  0  0  0  89 132  0 1879 62854  810 44 30 26  0
     3  1 55631  121  0  0  0  91 179  0 1892 62994  822 44 29 27  0
     2  1 55631  133  0  0  0  93 133  0 1913 63068  809 43 29 28  0
     2  1 55565  133  0  0  0  84 123  0  518 67530  114 28  8 64  0
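
To put a number on how busy the CPU is during the transfer to the 520, the sketch below sums the us and sy columns of that vmstat run. The rows are copied from the output above; the first row is skipped because vmstat's first report covers the time since system startup rather than the test interval. The function name is illustrative.

```python
from statistics import mean

# Rows copied from the "Going to 520" vmstat 5 run posted above.
VMSTAT_ROWS = """\
1 1 55512 130 0 0 0 39 96 0 471 13803 123 7 3 89 2
1 1 55512 122 0 0 0 99 123 0 438 67765 73 28 6 66 0
1 1 55623 120 0 0 0 122 195 0 940 66381 340 34 14 52 0
2 1 55623 130 0 0 0 93 160 0 1901 63358 810 45 27 27 0
2 1 55623 131 0 0 0 92 176 0 1928 63397 812 46 28 27 0
2 1 55623 129 0 0 0 90 172 0 1890 63526 823 46 28 26 0
2 1 55623 120 0 0 0 91 170 0 1893 63286 817 45 28 26 0
3 1 55623 130 0 0 0 93 138 0 1905 63479 814 45 28 27 0
2 1 55623 121 0 0 0 89 123 0 1906 63536 813 45 28 26 0
2 1 55631 122 0 0 0 93 142 0 1860 63225 808 43 32 23 1
2 1 55631 159 0 0 0 98 123 0 1840 63302 823 47 27 26 0
2 1 55631 132 0 0 0 85 123 0 1866 62908 827 47 28 25 0
2 1 55631 129 0 0 0 89 119 0 1870 63063 825 46 28 26 0
3 1 55631 121 0 0 0 89 132 0 1879 62854 810 44 30 26 0
3 1 55631 121 0 0 0 91 179 0 1892 62994 822 44 29 27 0
2 1 55631 133 0 0 0 93 133 0 1913 63068 809 43 29 28 0
2 1 55565 133 0 0 0 84 123 0 518 67530 114 28 8 64 0
"""


def busy_percentages(rows):
    """Return us + sy for each interval, skipping the first (since-boot) row."""
    busy = []
    for line in rows.strip().splitlines()[1:]:
        fields = line.split()
        # Column order: r b avm fre re pi po fr sr cy in sy cs us sy id wa
        us, sy = int(fields[13]), int(fields[14])
        busy.append(us + sy)
    return busy


busy = busy_percentages(VMSTAT_ROWS)
print(f"peak CPU busy: {max(busy)}%")
print(f"mean CPU busy: {mean(busy):.1f}%")
```

The mean includes the ramp-up and ramp-down intervals at the start and end of the run; during the steady part of the transfer the busy figure stays in the low-to-mid 70s.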


  • 13.  Re: Strange NIC bandwidth issue

    Posted Mon October 31, 2011 04:13 AM

    Originally posted by: Holgervk


    Obviously your CPU is quite busy making packets...

    Try enabling large_send if the adapter supports it; check with lsattr -El enXX


  • 14.  Re: Strange NIC bandwidth issue

    Posted Thu October 27, 2011 11:05 PM

    Originally posted by: esv


    Test removing CHECKSUM_OFFLOAD from en2 and see what happens.


  • 15.  Re: Strange NIC bandwidth issue

    Posted Fri October 28, 2011 12:32 PM

    Originally posted by: JohnMiller31


    Thanks for the suggestion but no joy.


  • 16.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 07:06 AM

    Originally posted by: tony.evans


    Oh, two more questions.

    What device are you moving the data to, and how fast is that machine's NIC (and CPU load, disk I/O, etc.)?

    Can you give us actual data transfer numbers, just so we can double-check the obvious?


  • 17.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 08:31 AM

    Originally posted by: awojo


    Are you sure you're not hitting disk bottlenecks?

    Reference my blog entry on testing PURE network speeds:
    https://www.ibm.com/developerworks/mydeveloperworks/blogs/AWojo/entry/pure_network_speed_testing?lang=en
    Doing that will be able to tell what your network is capable of transferring between the servers.


  • 18.  Re: Strange NIC bandwidth issue

    Posted Wed October 26, 2011 09:28 AM

    Originally posted by: Holgervk


    Is "100Mb speed" 100 Mbyte/sec or 100 Mbit/sec? How do you measure this?