AIX

  • 1.  monitoring tools and utilities

    Posted Mon November 08, 2021 07:32 AM
    Would appreciate any suggestions on useful monitoring tools and utilities to use on AIX.  I've been looking through what's available with AIX toolbox.  Other suggestions, especially Open Source ones appreciated.  I'm thinking of porting and/or building some utility programs from source.  Suggestions for FLOSS programs that may work well on AIX would be very helpful.  Thanks.

    ------------------------------
    Laura Michaels
    ------------------------------


  • 2.  RE: monitoring tools and utilities

    Posted Mon November 08, 2021 08:14 AM
    On Mon, Nov 08, 2021 at 12:31:35PM +0000, Laura Michaels via IBM Community wrote:
    > Would appreciate any suggestions on useful monitoring tools and
    > utilities to use on AIX. I've been looking through what's available
    > with AIX toolbox. Other suggestions, especially Open Source ones
    > appreciated. I'm thinking of porting and/or building some utility
    > programs from source. Suggestions for FLOSS programs that may work
    > well on AIX would be very helpful. Thanks.

    AIX already has a few built-in ways to monitor. I recommend using the
    ODM to have errdemon forward errpt messages directly to email and
    syslog. This is really low level and only depends on working email.

    https://adamssystems.nl/posts/simple-error-reporting/
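
    To make the errpt-to-syslog idea concrete, here is a minimal Python sketch of turning one line of errpt summary output into a syslog-style message. It is not the method from the linked post. The sample line, the function names, and the mapping from errpt type to syslog level are all assumptions made for illustration.

```python
from datetime import datetime
from typing import NamedTuple, Tuple


class ErrptEntry(NamedTuple):
    identifier: str
    timestamp: datetime
    err_type: str    # T column, e.g. P, T, I, U
    err_class: str   # C column, e.g. H, S, O, U
    resource: str
    description: str


# Hypothetical mapping from the errpt T column to a syslog level name.
LEVEL_BY_TYPE = {"P": "err", "T": "warning", "I": "info"}


def parse_errpt_line(line: str) -> ErrptEntry:
    """Parse one data line of the errpt summary listing (not the header)."""
    parts = line.split(None, 5)
    if len(parts) < 6:
        raise ValueError(f"unexpected errpt line: {line!r}")
    ident, stamp, err_type, err_class, resource, description = parts
    # The summary timestamp is MMDDhhmmYY.
    when = datetime.strptime(stamp, "%m%d%H%M%y")
    return ErrptEntry(ident, when, err_type, err_class, resource,
                      description.strip())


def to_syslog(entry: ErrptEntry) -> Tuple[str, str]:
    """Return (level, message); unknown types fall back to 'notice'."""
    level = LEVEL_BY_TYPE.get(entry.err_type, "notice")
    message = (f"errpt {entry.identifier} {entry.err_type}/{entry.err_class} "
               f"{entry.resource}: {entry.description}")
    return level, message


# Illustrative sample line, not taken from a real system in this thread.
SAMPLE = "A6DF45AA   1108073221 I O RMCdaemon      The daemon is started."
```

    On a real system the resulting level and message would be handed to whatever sends to syslog or email; that part is left out on purpose.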

    RSCT and RMC can also do resource monitoring, and they are AIX native;
    however, it's fairly uncommon to set this up. I've only tried it with
    filesystems.

    https://www.ibm.com/support/pages/node/630479

    https://www.ibm.com/support/pages/node/718919

    Don't forget to configure your HMC to mail event alerts too!

    Unfortunately I find email is the bottleneck for most alerting, and
    often breaks upstream. YMMV. Maybe consider using an alternate channel
    like sendxmpp on the monitor server.

    As to the open source world, I have customers who are sending syslog
    from AIX to Linux syslog servers, ELK and Splunk. The errpt forwarding
    is very useful there. I really like centralizing with syslog-ng's
    ability to save syslog data as a hierarchy (e.g.,
    /var/log/HOSTS/$hostname/YYYY/MM/DD/loglevel) and using tools like
    logtail or logmuncher to do reporting. No web services involved. ELK
    and Splunk are cool, but they are large applications that require
    significant customization.
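
    The directory hierarchy is what makes script-based reporting simple, because a report only needs to compute a path. Here is a small Python sketch of that layout. The function name and the default root are assumptions for illustration; syslog-ng itself builds the path from its own configuration.

```python
from datetime import date
from pathlib import PurePosixPath


def log_path(hostname: str, day: date, level: str,
             root: str = "/var/log/HOSTS") -> PurePosixPath:
    """Build root/hostname/YYYY/MM/DD/loglevel for one host, day and level."""
    # Refuse names that would escape the hierarchy.
    for part in (hostname, level):
        if not part or "/" in part or part in (".", ".."):
            raise ValueError(f"unsafe path component: {part!r}")
    return (PurePosixPath(root) / hostname
            / f"{day:%Y}" / f"{day:%m}" / f"{day:%d}" / level)
```

    A reporting job could call `log_path("aix01", date.today(), "err")` for each host it cares about and feed the file to a tool such as logtail.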

    I've had excellent experiences with Nagios and its derivatives. The
    plugins can just be shell scripts and are quick to make ad hoc on AIX. I
    prefer a cron-based "push" of checks to Nagios (e.g., NSCA)
    instead of running a root-level daemon (e.g., NRPE). Common Perl
    libraries for submission run fine on AIX. Don't forget to set the
    checks on Nagios to look for missing submissions (i.e., stale
    services).
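
    As a rough illustration of the push model, the Python sketch below formats one passive service result the way send_nsca reads it on standard input (tab-separated host, service, return code, and plugin output) and shows the idea behind a staleness test. The function names are made up for this example. In practice Nagios does the staleness detection itself through its freshness-checking options (`check_freshness` and `freshness_threshold`) on the service definition.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def nsca_line(host: str, service: str, code: int, output: str) -> str:
    """Format one passive service check result for send_nsca's stdin."""
    if code not in (OK, WARNING, CRITICAL, UNKNOWN):
        raise ValueError(f"invalid return code: {code}")
    # Tabs and newlines are field and record separators, so strip them
    # from the free-text plugin output.
    clean = output.replace("\t", " ").replace("\n", " ")
    return f"{host}\t{service}\t{code}\t{clean}\n"


def is_stale(last_submit_epoch: float, now_epoch: float,
             max_age_seconds: float) -> bool:
    """True when no result has arrived within the allowed window."""
    return now_epoch - last_submit_epoch > max_age_seconds
```

    A cron job would write lines like these to send_nsca. If the job dies silently, only the freshness check on the Nagios side will notice, which is why it matters.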

    For performance monitoring, nothing beats NMON and the new nmonchart
    tool. I always set up NMON on every system, logging to a dedicated 2GB
    filesystem in /var/nmon. The HMC also has a built-in performance
    monitor now; make sure to enable it. I've seen customers successfully
    use LPAR2RRD and Ganglia; however, I wouldn't rely on setting those up
    from the toolkit. AIX's snmpdv3 also supports HOSTMIB, so some
    external monitoring tools can query information (e.g., disks and
    processes) once you configure it.

    Let us know what else you discover or what you choose to implement.

    ------------------------------------------------------------------
    Russell Adams Russell.Adams@AdamsSystems.nl
    Principal Consultant Adams Systems Consultancy
    http://adamssystems.nl/




  • 3.  RE: monitoring tools and utilities

    Posted Tue November 09, 2021 09:20 AM
    Excellent summary.  First thing I would add is to use Linux, if you can in your environment, for the monitoring server.  It opens up more opportunities for various software unless you like to compile your own on AIX.  The Toolbox is getting better all the time, but it can't duplicate every piece of FLOSS software.

    I've used Ganglia and Nagios, and even Big Brother back in the day.  We currently use Icinga (a Nagios fork, although I haven't been able to get v2 to work on AIX).

    nmon data is transferred to a web server, where we use nmon2web so that we can view the graphs in any web browser.  It is a relative of lpar2rrd.  On the same web server, we use stor2rrd for viewing performance graphs for our SAN.  It can also be used for fibre switches.

    Many of our other tools are home-grown scripts whose html output is emailed to our sysadmin distribution list.  These took a good amount of time to develop over the years.  The emails are also stored in a generic mailbox for a period of time for audit purposes.

    Some examples are a system summary report including df output, failed logins, specific logins and processes, and errpt entries since the last report.  These are also colorized so that concerns can be easily seen.  We have a small environment, and this could be impractical in an environment that has hundreds of servers.
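
    For anyone wanting to start on a report like this, here is a hedged Python sketch of the df portion only: it parses POSIX-format output (as from `df -Pk`) and colors the usage cell by threshold. It is not the scripts described above. The sample output, thresholds, colors, and function names are assumptions for illustration.

```python
import html
from typing import List, Tuple

# Illustrative sample in POSIX "df -P" column layout.
SAMPLE_DF = """\
Filesystem    1024-blocks      Used Available Capacity Mounted on
/dev/hd4           524288    210000    314288      41% /
/dev/hd9var       1048576    996148     52428      95% /var
/proc                   -         -         -       -  /proc
"""


def color_for(pct: int, warn: int = 80, crit: int = 90) -> str:
    """Pick a cell color from hypothetical warning/critical thresholds."""
    if pct >= crit:
        return "red"
    if pct >= warn:
        return "orange"
    return "green"


def df_rows(df_output: str) -> List[Tuple[str, int]]:
    """Return (mount point, percent used) pairs, skipping the header."""
    rows = []
    for line in df_output.strip().splitlines()[1:]:
        fields = line.split(None, 5)
        # Skip pseudo filesystems that report "-" instead of a percentage.
        if len(fields) < 6 or not fields[4].rstrip("%").isdigit():
            continue
        rows.append((fields[5], int(fields[4].rstrip("%"))))
    return rows


def html_table(rows: List[Tuple[str, int]]) -> str:
    """Render the rows as a small HTML table with a colored usage cell."""
    out = ["<table>"]
    for mount, pct in rows:
        out.append(f"<tr><td>{html.escape(mount)}</td>"
                   f'<td style="color:{color_for(pct)}">{pct}%</td></tr>')
    out.append("</table>")
    return "\n".join(out)
```

    The HTML string can then be mailed as the report body by whatever mail mechanism is already in place.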

    Others summarize backup activity.  Again, key measures and concerns are colorized.

    ------------------------------
    Bruce
    ------------------------------



  • 4.  RE: monitoring tools and utilities

    Posted Mon November 08, 2021 09:59 AM
    Nagios/NRPE
    nmon/njmon

    Both of those are fairly easy to get up and running. Writing custom NRPE plugins is easy.
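
    To show how little a plugin needs, here is a minimal Python sketch that follows the usual Nagios plugin convention: one line of output, optional performance data after a `|`, and an exit code of 0, 1, 2, or 3 for OK, WARNING, CRITICAL, or UNKNOWN. The function names and the "usage" label are hypothetical, and how the value is measured is left out.

```python
from typing import List, Tuple


def evaluate(value: int, warn: int, crit: int,
             label: str = "usage") -> Tuple[int, str]:
    """Map a percentage to a Nagios status code and a one-line message."""
    if value >= crit:
        code, word = 2, "CRITICAL"
    elif value >= warn:
        code, word = 1, "WARNING"
    else:
        code, word = 0, "OK"
    # Text after the "|" is performance data: label=value;warn;crit
    return code, f"{word} - {label} is {value}% | {label}={value}%;{warn};{crit}"


def main(argv: List[str]) -> int:
    """Expect three integers: value, warning threshold, critical threshold."""
    try:
        value, warn, crit = (int(arg) for arg in argv)
    except ValueError:
        print("UNKNOWN - usage: plugin VALUE WARN CRIT")
        return 3
    code, message = evaluate(value, warn, crit)
    print(message)
    return code

# A real plugin script would end with: sys.exit(main(sys.argv[1:]))
```

    NRPE only cares about the exit code and the first line of output, so the same pattern works equally well in ksh or Perl.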

    ------------------------------
    Anthony Cascianelli
    ------------------------------



  • 5.  RE: monitoring tools and utilities

    Posted Mon November 08, 2021 02:41 PM
    I found an old version of top with AIX support and it seems to build and run okay on AIX.  I also saw some patches to htop to get an older version to work on AIX.  After moving those patches to the current version and adding some new ones to fix differences between AIX and other systems, htop seems to be up and running.  They both probably need further cleaning up and patching, but as a proof of concept, either seems to work on AIX.

    ------------------------------
    Laura Michaels
    ------------------------------



  • 6.  RE: monitoring tools and utilities

    Posted Tue November 09, 2021 07:16 AM
    Hi Laura,

    For metrics / performance I have some suggestions:

    I have been happy with Telegraf on Linux, and I can see there's an AIX version available, though I have not tested it on AIX:
    https://www.power-devops.com/telegraf

    A personal (OSS) project of mine visualizes performance metrics directly from the HMC (without the need for any agents) through InfluxDB -> Grafana:
    https://bitbucket.org/mnellemann/hmci/


    I would also recommend forwarding ODM / error log entries to a centralized syslog server.


    Best regards,
    Mark

    ------------------------------
    Mark Nellemann
    ------------------------------



  • 7.  RE: monitoring tools and utilities

    Posted Tue November 09, 2021 08:54 AM

    Hi Laura,

    Galileo Performance Explorer has been available for AIX since 2007, and is a great option, with real-time alerting, correlation to related assets (switches, storage, etc.), dynamic 3D maps, and original data points forever (including every unique PID).

    You can check it out for free.
    https://galileosuite.com

    I can also help answer any questions you might have directly.

    Thanks!
    Bob

    Bob Bender | Director of Galileo Support
    mobile 610.680.8130 | bbender@GalileoSuite.com