App Connect

  • 1.  Out of memory in CP4I- ACE

    Posted Tue August 08, 2023 12:21 PM

    Product name: APP Connect
    Version: 2022
     
    We are running a complex query in a flow deployed to an ACE integration server container on an OpenShift cluster. In the flow, the query runs multiple times; each time it retrieves 50 database records and puts them on a queue as a batch. We noticed that memory utilization went up gradually and eventually the pod ran out of memory. As a result, the container in the pod was restarted by K8s.
     
    We did not find anything in the flow that could cause memory to keep going up. Is there any way to profile why the memory is going up? Is there any known memory leak issue with the integration server?
     
     
    By the way, the flow works fine in a local ACE integration server.

    TOP Command on POD:


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               11m          1542Mi

    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               7m           1601Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               6m           1651Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               11m          1659Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               5m           1664Mi

    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               8m           1697Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               8m           1697Mi

    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               14m          1720Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               14m          1720Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               9m           1783Mi


    PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'

    is-01-publisher-is-8467c9b4b7-8vklv                               11m          1879Mi

    Pod log:

    2023-08-07T18:10:40.678Z Metrics: Connected
    2023-08-07T19:19:00.367Z Metrics: Using CA Certificate folder /home/aceuser/adminssl
    ERROR: StderrLogging has stopped
    ERROR: StdoutLogging has stopped
    ERROR: bgCmd.ReturnCode: 0
    ERROR: bgCmd.ReturnCode: 0
    2023-08-07T19:19:00.367Z Metrics: Adding Certificate /home/aceuser/adminssl/ca.crt.pem to CA pool
    2023-08-07T19:19:00.368Z Metrics: Adding Certificate /home/aceuser/adminssl/tls.crt.pem to CA pool
    2023-08-07T19:19:00.370Z Metrics: Using provided cert and key for mutual auth
    2023-08-07T19:19:00.370Z Metrics: ACE_ADMIN_SERVER_NAME is is-01-publisher
    2023-08-07T19:19:00.370Z Metrics: Connecting to wss://localhost:7600/ for statistics gathering
    2023-08-07T19:19:00.370Z Metrics: Using provided webusers/admin-users.txt, connecting with user authentication
    2023-08-07T19:19:00.371Z Metrics: Error retrieving session: Get "https://localhost:7600/": dial tcp [::1]:7600: connect: connection refused
    2023-08-07T19:19:00.371Z Metrics: Connecting to wss://localhost:7600/ with SSL
    2023-08-07T19:19:00.371Z Metrics: Error calling ace admin server webservice endpoint dial tcp [::1]:7600: connect: connection refused
    2023-08-07T19:19:00.371Z Metrics: If this repeats then check you have assigned enough memory to your Pod and you aren't running out of memory
    2023-08-07T19:19:00.371Z Metrics: Sleeping for 5 seconds before retrying to connect to metrics...
    2023-08-07T19:19:05.371Z Metrics: Using CA Certificate folder /home/aceuser/adminssl
    2023-08-07T19:19:05.372Z Metrics: Adding Certificate /home/aceuser/adminssl/ca.crt.pem to CA pool
    2023-08-07T19:19:05.372Z Metrics: Adding Certificate /home/aceuser/adminssl/tls.crt.pem to CA pool
    2023-08-07T19:19:05.373Z Metrics: Using provided cert and key for mutual auth
    2023-08-07T19:19:05.373Z Metrics: ACE_ADMIN_SERVER_NAME is is-01-publisher
    2023-08-07T19:19:05.373Z Metrics: Connecting to wss://localhost:7600/ for statistics gathering
    2023-08-07T19:19:05.373Z Metrics: Using provided webusers/admin-users.txt, connecting with user authentication
    2023-08-07T19:19:05.374Z Metrics: Error retrieving session: Get "https://localhost:7600/": dial tcp [::1]:7600: connect: connection refused
    2023-08-07T19:19:05.374Z Metrics: Connecting to wss://localhost:7600/ with SSL
    2023-08-07T19:19:05.374Z Metrics: Error calling ace admin server webservice endpoint dial tcp [::1]:7600: connect: connection refused
    2023-08-07T19:19:05.374Z Metrics: If this repeats then check you have assigned enough memory to your Pod and you aren't running out of memory
    2023-08-07T19:19:05.374Z Metrics: Sleeping for 5 seconds before retrying to connect to metrics...
    2023-08-07T19:19:10.374Z Metrics: Using CA Certificate folder /home/aceuser/adminssl
    2023-08-07T19:19:10.374Z Metrics: Adding Certificate /home/aceuser/adminssl/ca.crt.pem to CA pool
    2023-08-07T19:19:10.375Z Metrics: Adding Certificate /home/aceuser/adminssl/tls.crt.pem to CA pool
    2023-08-07T19:19:10.376Z Metrics: Using provided cert and key for mutual auth
    2023-08-07T19:19:10.376Z Metrics: ACE_ADMIN_SERVER_NAME is is-01-publisher
    2023-08-07T19:19:10.376Z Metrics: Connecting to wss://localhost:7600/ for statistics gathering
    2023-08-07T19:19:10.376Z Metrics: Using provided webusers/admin-users.txt, connecting with user authentication
    2023-08-07T19:19:10.376Z Metrics: Error retrieving session: Get "https://localhost:7600/": dial tcp [::1]:7600: connect: connection refused
    2023-08-07T19:19:10.376Z Metrics: Connecting to wss://localhost:7600/ with SSL
    2023-08-07T19:19:10.377Z Metrics: Error calling ace admin server webservice endpoint dial tcp [::1]:7600: connect: connection refused
    2023-08-07T19:19:10.377Z Metrics: If this repeats then check you have assigned enough memory to your Pod and you aren't running out of memory
    2023-08-07T19:19:10.377Z Metrics: Sleeping for 5 seconds before retrying to connect to metrics...
    2023-08-07T19:19:14.805Z Signal received: terminated
    2023-08-07T19:19:14.805Z Stopping metrics gathering
    2023-08-07T19:19:14.805Z Stopping Integration Server
    2023-08-07T19:19:14.805Z Integration Server stopped
    2023-08-07T19:19:14.805Z Contents of log directory
    2023-08-07T19:19:14.808Z total 12
    -rw-rw----. 1 1001350000 1001350000  83 Aug  7 18:10 integration_server.is-01-publisher.exceptionLog.txt
    -rw-rw----. 1 1001350000 1001350000 355 Aug  7 18:10 integration_server.is-01-publisher.trace.0.txt
    -rw-rw----. 1 1001350000 1001350000 355 Aug  7 18:10 integration_server.is-01-publisher.userTrace.0.txt

    2023-08-07T19:19:14.808Z If you want to stop the container shutting down to enable retrieval of these files please set the environment variable "MQSI_PREVENT_CONTAINER_SHUTDOWN=true"
    2023-08-07T19:19:14.808Z If you are running under kubernetes you will also need to disable the livenessProbe
    2023-08-07T19:19:14.808Z Log checking complete
    2023-08-07T19:19:14.808Z Commands API server stopped
    2023-08-07T19:19:14.808Z Integration commands API stopped
    2023-08-07T19:19:14.808Z Shutdown complete
    PS D:\Users\AROY3\Desktop\Horizon\oc\client>



    ------------------------------
    Atanu Roy
    ------------------------------


  • 2.  RE: Out of memory in CP4I- ACE

    Posted Wed August 09, 2023 09:36 AM
    Edited by David Brickell Wed August 09, 2023 10:18 AM

    ACE, and IIB before it, do not release memory until restart. This doesn't mean memory is expected to grow forever; it means memory should grow and eventually plateau as objects like parsers are reused. This helps keep ACE processing fast.
    You can try setting freeMasterParsers to true, which makes ACE free the memory from parsers after each transaction. The downside is that new memory is allocated on each transaction as new parsers are created again. This can be done in the server.conf.yaml file.

    ParserManager:
        #parserWarningThreshold: 1000
        #fieldWarningThreshold: 100000
        freeMasterParsers: true


    Ideally, run this application on prem and determine what its memory requirements are.
    If you run it on prem and see the same unbounded memory growth, that would be a good sign of a leak and you would want to open a case with support.
    If the memory eventually settles, even at a high level, that indicates it is not a leak, and you would want to look at optimizations to your application to reduce its memory footprint.

    When you have a good indication of how much memory the application needs to run, you can adjust the memory limits for the pod deployment to match its performance needs.
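
    One rough way to tell a plateau from continued growth is to collect memory readings over time and check whether the most recent ones have stopped moving. The following is a minimal Python sketch of that idea, not an ACE or OpenShift tool. The helper names (parse_memory_mib, has_plateaued), the window of 4 readings, and the 20 Mi tolerance are all assumptions chosen for illustration, and the parser assumes readings are reported in Mi as in the top output posted in this thread.

```python
import re

# Three sample lines copied from the `oc adm top pod` output posted in this thread.
SAMPLE_OUTPUT = """
is-01-publisher-is-8467c9b4b7-8vklv                               11m          1542Mi
is-01-publisher-is-8467c9b4b7-8vklv                               7m           1601Mi
is-01-publisher-is-8467c9b4b7-8vklv                               6m           1651Mi
"""


def parse_memory_mib(top_output):
    """Return every memory reading (values ending in 'Mi') as integers."""
    return [int(value) for value in re.findall(r"\b(\d+)Mi\b", top_output)]


def has_plateaued(samples, window=4, tolerance_mib=20):
    """Hypothetical check: True if the last `window` readings span at most
    `tolerance_mib`. Both defaults are arbitrary and should be tuned."""
    if len(samples) < window:
        return False  # not enough data to judge
    recent = samples[-window:]
    return max(recent) - min(recent) <= tolerance_mib


if __name__ == "__main__":
    # All eleven memory readings from the original post, in order.
    readings = [1542, 1601, 1651, 1659, 1664, 1697, 1697, 1720, 1720, 1783, 1879]
    print("parsed sample:", parse_memory_mib(SAMPLE_OUTPUT))
    print("growth (Mi):", readings[-1] - readings[0])
    print("plateaued:", has_plateaued(readings))
```

    With the readings from the original post, the last four samples still span well over the tolerance, so the check reports no plateau. A longer observation period, ideally on prem as suggested above, would be needed before drawing a conclusion either way.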



    ------------------------------
    David Brickell
    ------------------------------



  • 3.  RE: Out of memory in CP4I- ACE

    Posted Thu August 10, 2023 01:09 PM

     After a clean restart, memory was 5 GB. It reached a maximum of 7.8 GB while processing and then stayed constant at 7.5 GB for most of the time. So I assume about 3.5 GB of memory might be needed in the pod; I have allocated 2 GB.

     



    ------------------------------
    Atanu Roy
    ------------------------------

    Attachment(s)

    note.docx (404 KB)