After a clean restart, memory was 5 GB. It peaked at 7.8 GB while processing and then stayed roughly constant at 7.5 GB most of the time. So I assume about 3.5 GB of memory might be needed in the pod; I have allocated 2 GB.
Original Message:
Sent: Wed August 09, 2023 09:35 AM
From: David Brickell
Subject: Out of memory in CP4I- ACE
ACE, and IIB before it, do not release memory until restart. This doesn't mean memory is expected to grow forever; it does mean that memory should grow and eventually plateau as objects like parsers are reused. This helps keep ACE processing fast.
You can try setting freeMasterParsers to true, which will have ACE free the parser memory after each transaction; the downside is that new memory is allocated on each subsequent transaction as new parsers are created again. This can be done in the server.conf.yaml file:
ParserManager:
  #parserWarningThreshold: 1000
  #fieldWarningThreshold: 100000
  #freeMasterParsers: true
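For example, to enable this you would uncomment the setting and set it to true (a minimal sketch of the edited server.conf.yaml fragment; the integration server needs to be restarted for server.conf.yaml changes to take effect):
ParserManager:
  freeMasterParsers: true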
Ideally, you will want to run this application on prem and determine what the memory requirements are for it.
If you run this on prem and see the same unbounded memory growth, that would be a good indication of a leak, and you would want to open a case with support.
If you see that the memory eventually settles, even if high, then this would indicate that it is not a leak, and you would want to look at optimizations you can make to your application to reduce its memory footprint.
When you have a good indication of how much memory the application is expected to need, you can adjust the memory limits for the pod deployment to allow for those needs.
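As a rough illustration, a memory request and limit on the runtime container might look like the fragment below (a minimal sketch of a standard Kubernetes container resources block; the values are placeholders to be replaced with what you measure on prem, and in an operator-managed CP4I deployment these are typically set through the IntegrationServer custom resource rather than by editing the Deployment directly):
resources:
  requests:
    memory: "3Gi"   # placeholder: roughly the plateau observed on prem
  limits:
    memory: "4Gi"   # placeholder: headroom above the plateau before the pod is OOM-killed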
------------------------------
David Brickell
Original Message:
Sent: Tue August 08, 2023 08:54 AM
From: Atanu Roy
Subject: Out of memory in CP4I- ACE
Product name: APP Connect
Version: 2022
We are running a complex query in a flow deployed to an ACE integration server container on an OpenShift cluster. In the flow, the query runs multiple times; each time it retrieves 50 database records and puts them as a batch onto a queue. We noticed that memory utilization kept climbing gradually until the pod eventually ran out of memory. As a result, the container in the pod was restarted by Kubernetes.
We did not find any issue in the flow that would cause the memory to keep growing. Is there any way we can profile why the memory is going up? Is there any known issue with the integration server that could cause a memory leak?
By the way, the flow works fine in a local ACE integration server.
oc adm top pod output:
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 11m 1542Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 7m 1601Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 6m 1651Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 11m 1659Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 5m 1664Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 8m 1697Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 8m 1697Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 14m 1720Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 14m 1720Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 9m 1783Mi
PS D:\Users\AROY3\Desktop\Horizon\oc\client> ./oc adm top pod | Select-String 'publish'
is-01-publisher-is-8467c9b4b7-8vklv 11m 1879Mi
Pod log:
2023-08-07T18:10:40.678Z Metrics: Connected
2023-08-07T19:19:00.367Z Metrics: Using CA Certificate folder /home/aceuser/adminssl
ERROR: StderrLogging has stopped
ERROR: StdoutLogging has stopped
ERROR: bgCmd.ReturnCode: 0
ERROR: bgCmd.ReturnCode: 0
2023-08-07T19:19:00.367Z Metrics: Adding Certificate /home/aceuser/adminssl/ca.crt.pem to CA pool
2023-08-07T19:19:00.368Z Metrics: Adding Certificate /home/aceuser/adminssl/tls.crt.pem to CA pool
2023-08-07T19:19:00.370Z Metrics: Using provided cert and key for mutual auth
2023-08-07T19:19:00.370Z Metrics: ACE_ADMIN_SERVER_NAME is is-01-publisher
2023-08-07T19:19:00.370Z Metrics: Connecting to wss://localhost:7600/ for statistics gathering
2023-08-07T19:19:00.370Z Metrics: Using provided webusers/admin-users.txt, connecting with user authentication
2023-08-07T19:19:00.371Z Metrics: Error retrieving session: Get "https://localhost:7600/": dial tcp [::1]:7600: connect: connection refused
2023-08-07T19:19:00.371Z Metrics: Connecting to wss://localhost:7600/ with SSL
2023-08-07T19:19:00.371Z Metrics: Error calling ace admin server webservice endpoint dial tcp [::1]:7600: connect: connection refused
2023-08-07T19:19:00.371Z Metrics: If this repeats then check you have assigned enough memory to your Pod and you aren't running out of memory
2023-08-07T19:19:00.371Z Metrics: Sleeping for 5 seconds before retrying to connect to metrics...
2023-08-07T19:19:05.371Z Metrics: Using CA Certificate folder /home/aceuser/adminssl
2023-08-07T19:19:05.372Z Metrics: Adding Certificate /home/aceuser/adminssl/ca.crt.pem to CA pool
2023-08-07T19:19:05.372Z Metrics: Adding Certificate /home/aceuser/adminssl/tls.crt.pem to CA pool
2023-08-07T19:19:05.373Z Metrics: Using provided cert and key for mutual auth
2023-08-07T19:19:05.373Z Metrics: ACE_ADMIN_SERVER_NAME is is-01-publisher
2023-08-07T19:19:05.373Z Metrics: Connecting to wss://localhost:7600/ for statistics gathering
2023-08-07T19:19:05.373Z Metrics: Using provided webusers/admin-users.txt, connecting with user authentication
2023-08-07T19:19:05.374Z Metrics: Error retrieving session: Get "https://localhost:7600/": dial tcp [::1]:7600: connect: connection refused
2023-08-07T19:19:05.374Z Metrics: Connecting to wss://localhost:7600/ with SSL
2023-08-07T19:19:05.374Z Metrics: Error calling ace admin server webservice endpoint dial tcp [::1]:7600: connect: connection refused
2023-08-07T19:19:05.374Z Metrics: If this repeats then check you have assigned enough memory to your Pod and you aren't running out of memory
2023-08-07T19:19:05.374Z Metrics: Sleeping for 5 seconds before retrying to connect to metrics...
2023-08-07T19:19:10.374Z Metrics: Using CA Certificate folder /home/aceuser/adminssl
2023-08-07T19:19:10.374Z Metrics: Adding Certificate /home/aceuser/adminssl/ca.crt.pem to CA pool
2023-08-07T19:19:10.375Z Metrics: Adding Certificate /home/aceuser/adminssl/tls.crt.pem to CA pool
2023-08-07T19:19:10.376Z Metrics: Using provided cert and key for mutual auth
2023-08-07T19:19:10.376Z Metrics: ACE_ADMIN_SERVER_NAME is is-01-publisher
2023-08-07T19:19:10.376Z Metrics: Connecting to wss://localhost:7600/ for statistics gathering
2023-08-07T19:19:10.376Z Metrics: Using provided webusers/admin-users.txt, connecting with user authentication
2023-08-07T19:19:10.376Z Metrics: Error retrieving session: Get "https://localhost:7600/": dial tcp [::1]:7600: connect: connection refused
2023-08-07T19:19:10.376Z Metrics: Connecting to wss://localhost:7600/ with SSL
2023-08-07T19:19:10.377Z Metrics: Error calling ace admin server webservice endpoint dial tcp [::1]:7600: connect: connection refused
2023-08-07T19:19:10.377Z Metrics: If this repeats then check you have assigned enough memory to your Pod and you aren't running out of memory
2023-08-07T19:19:10.377Z Metrics: Sleeping for 5 seconds before retrying to connect to metrics...
2023-08-07T19:19:14.805Z Signal received: terminated
2023-08-07T19:19:14.805Z Stopping metrics gathering
2023-08-07T19:19:14.805Z Stopping Integration Server
2023-08-07T19:19:14.805Z Integration Server stopped
2023-08-07T19:19:14.805Z Contents of log directory
2023-08-07T19:19:14.808Z total 12
-rw-rw----. 1 1001350000 1001350000 83 Aug 7 18:10 integration_server.is-01-publisher.exceptionLog.txt
-rw-rw----. 1 1001350000 1001350000 355 Aug 7 18:10 integration_server.is-01-publisher.trace.0.txt
-rw-rw----. 1 1001350000 1001350000 355 Aug 7 18:10 integration_server.is-01-publisher.userTrace.0.txt
2023-08-07T19:19:14.808Z If you want to stop the container shutting down to enable retrieval of these files please set the environment variable "MQSI_PREVENT_CONTAINER_SHUTDOWN=true"
2023-08-07T19:19:14.808Z If you are running under kubernetes you will also need to disable the livenessProbe
2023-08-07T19:19:14.808Z Log checking complete
2023-08-07T19:19:14.808Z Commands API server stopped
2023-08-07T19:19:14.808Z Integration commands API stopped
2023-08-07T19:19:14.808Z Shutdown complete
------------------------------
Atanu Roy
------------------------------