API Connect

  • 1.  APIC Analytics Subsystem Performance Monitoring

    Posted 22 days ago

    Hi 

    I need advice on how to effectively monitor the performance of the Analytics subsystem. I'd also like to understand how best to handle situations where performance starts to deteriorate.

     

    We are running API Connect 10.0.5 in a VMware deployment. API Connect currently retains 3 months of analytics data on the APIC servers. Beyond CPU, memory, and storage, what other parameters or metrics can we track to proactively identify performance issues within the APIC Analytics subsystem? This would allow us to address potential slowdowns before they become critical. We would also like to explore options for remediating performance issues, such as data archiving or alternative approaches.

     



    ------------------------------
    Guo Jun Qiao
    ------------------------------


  • 2.  RE: APIC Analytics Subsystem Performance Monitoring

    Posted 21 days ago

    Analytics provides a cluster management API to allow access to health metrics for analytics storage. For example, you can obtain CPU and memory information with the "cat/nodes" endpoint:

    https://apic-api.apiconnect.ibmcloud.com/v10/10.0.5.7.html#/IBMAPIConnectAnalyticsAPI_200/operation/%2F{analytics-service}%2Fcloud%2Fclustermgmt%2Fstorage%2Fcat%2Fnodes/get

    You can obtain disk usage with the "cat/allocation" endpoint:

    https://apic-api.apiconnect.ibmcloud.com/v10/10.0.5.7.html#/IBMAPIConnectAnalyticsAPI_200/operation/%2F{analytics-service}%2Fcloud%2Fclustermgmt%2Fstorage%2Fcat%2Fallocation/get

    The most important metrics for performance are heap memory and disk usage. If you are retaining 3 months (90 days) of data when analytics is running with a small amount of memory, you are very likely to encounter performance problems due to high heap usage leading to a lot of time spent garbage collecting. The mitigation here is to either reduce the data retention or switch to a deployment profile with more memory. Guidance on suitable retention periods is available here:

    https://www.ibm.com/docs/en/api-connect/10.0.5.x_lts?topic=analytics-configuring-data-retention-index-rollover-time-periods

    Disk usage should be kept below 80% to avoid the performance impact of data being moved away from nodes that are low on disk space.

    You might also want to consider configuring analytics to offload the data to an external system for longer retention so that you can reduce the resource load on analytics. Details regarding offload are available here:

    https://www.ibm.com/docs/en/api-connect/10.0.5.x_lts?topic=deployment-planning-offload-data-third-party-system



    ------------------------------
    Mark Taylor
    ------------------------------



  • 3.  RE: APIC Analytics Subsystem Performance Monitoring

    Posted 20 days ago

    Hi Mark

    Thank you for your reply.

    Is it possible for Fluent-bit to collect the CPU, memory, and disk space information and forward it to a remote syslog server?

    cat/nodes (CPU, memory)

    cat/allocation (disk space)



    ------------------------------
    Guo Jun Qiao
    ------------------------------



  • 4.  RE: APIC Analytics Subsystem Performance Monitoring

    Posted 16 days ago

    We have not tested using Fluent-bit and I confess I am not very familiar with that offering.

    I think it's worth mentioning that the APIs I highlighted are accessible via the API Connect toolkit CLI. Having briefly reviewed the Fluent-bit website, it might be possible to configure it to use this CLI to collect the metrics you are interested in and forward them.



    ------------------------------
    Mark Taylor
    ------------------------------



  • 5.  RE: APIC Analytics Subsystem Performance Monitoring

    Posted 15 days ago

    No, it can't.

    It can basically only forward the container logs.

    If you were running in your own container environment, you'd have more flexibility to access information like that from tools like Prometheus or Instana.

    That info is available via the APIC API/CLI, but you'd need to call the API yourself; it's not going to be forwarded for you.
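
    A minimal sketch of that do-it-yourself approach, in Python: poll the metrics yourself, then forward each row to a remote syslog server with the standard library's `SysLogHandler`. The `fetch_rows` callable is a hypothetical placeholder for however you call the APIC API or CLI (not shown here), and the logger name, message prefix, and port are assumptions for illustration.

```python
import logging
import logging.handlers


def format_row(source, row):
    """Render one metrics row as a single key=value line for syslog."""
    pairs = " ".join(f"{key}={row[key]}" for key in sorted(row))
    return f"apic-analytics source={source} {pairs}"


def build_syslog_logger(host, port=514):
    """Create a logger that sends records to a remote syslog server over UDP."""
    logger = logging.getLogger("apic.analytics.metrics")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding duplicate handlers on repeated calls
        logger.addHandler(logging.handlers.SysLogHandler(address=(host, port)))
    return logger


def forward_once(fetch_rows, logger):
    """Fetch rows for both endpoints and log each one; return the number sent.

    fetch_rows: hypothetical callable; given "cat/nodes" or "cat/allocation",
    it should return a list of dicts obtained from the APIC API or CLI.
    """
    sent = 0
    for source in ("cat/nodes", "cat/allocation"):
        for row in fetch_rows(source):
            logger.info(format_row(source, row))
            sent += 1
    return sent
```

    You would then run something like `forward_once(my_fetch, build_syslog_logger("syslog.example.com"))` on a schedule, for example from cron; `my_fetch` and `syslog.example.com` are placeholders.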



    ------------------------------
    Chris Dudley
    ------------------------------



  • 6.  RE: APIC Analytics Subsystem Performance Monitoring

    Posted 11 days ago

    Hi Chris, 

    According to Mark's reply above, heap memory and disk space usage are the most important metrics for performance monitoring of the Analytics subsystem. When calling the /cat/nodes API for memory and /cat/allocation for disk space, what does APIC report:

    • Memory:
      • Total memory allocated to the server? 
      • OR Heap memory allocated specifically to the Analytics sub-system? 
    • Disk Space:
      • Total disk space usage of the server?
      • OR Disk space usage of the specific partition allocated to APIC (e.g., /path/to/apic)? 



    ------------------------------
    Guo Jun Qiao
    ------------------------------



  • 7.  RE: APIC Analytics Subsystem Performance Monitoring

    Posted 11 days ago

    Those are OpenSearch APIs, so I suggest you refer to the OpenSearch documentation. APIC is just exposing the OpenSearch API. Note that it only gives you information about analytics storage and not any of the other pods.
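
    For reference, going by the OpenSearch `_cat` documentation (worth double-checking for your version), `heap.percent` is the JVM heap in use by the OpenSearch node, while `ram.percent` is memory in use on the host or container as a whole. Likewise, `disk.indices` is the space taken by the node's shards, while `disk.used` is everything in use on the filesystem that holds the data. The Python sketch below picks those columns out of parsed rows; it assumes APIC passes the OpenSearch JSON through unchanged, and the `describe` helper is hypothetical.

```python
# Column meanings as described in the OpenSearch _cat documentation
# (an assumption to verify for your version).
MEMORY_COLUMNS = {
    "heap.percent": "JVM heap in use by the OpenSearch node",
    "ram.percent": "memory in use on the whole host or container",
}
DISK_COLUMNS = {
    "disk.indices": "space used by this node's shards",
    "disk.used": "total space in use on the data filesystem",
    "disk.percent": "disk.used as a percentage of disk.total",
}


def describe(row, columns):
    """Return 'column = value (meaning)' lines for the columns present in row."""
    return [
        f"{name} = {row[name]} ({meaning})"
        for name, meaning in columns.items()
        if name in row
    ]
```

    Calling `describe(row, MEMORY_COLUMNS)` on a cat/nodes row, or `describe(row, DISK_COLUMNS)` on a cat/allocation row, labels only the columns that are actually present.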



    ------------------------------
    Chris Dudley
    ------------------------------