IBM Data Management Community Connect with Db2, Informix, Netezza, open source, and other data experts to gain value from your data, share insights, and solve problems. Join / Log in
My company recently completed a multi-year migration off old Informix 11.70.FC7 (RHEL 6) servers to new Informix 14.10.FC9 (RHEL 8) servers. Following the migration we are noticing slower performance of some app team 4GL scripts that run locally on the database server.
One specific 4GL script we've been testing contains cursors that cycle row by row through around 333,000 inserts and 18,000 deletes. It also has frequent writes to files on disk. As part of troubleshooting efforts we've ran SQL Traces against the specific user executing the 4GL on non-Prod environments to determine if Informix or other portions of the code are to blame. Comparing the results of the SQL Traces between the 11.70.FC7 and 14.10.FC9 servers, the data indicates each Insert and Delete involved is taking quite a bit longer in comparison on the new servers.
11.70 Server Average Runtimes:Deletes: .0002 secondsInserts: <.0001 seconds
14.10 Server Average Runtimes:Deletes: .0025 secondsInserts: .0023 secondsThis may not appear major, but it's contributing to runtimes 30+ minutes longer for this specific 4GL process than the old server. The SQL Trace findings led us to start looking into the disk and IO configurations.
Additional background info:
Both the new and old Informix Servers are VMs running in the same RHV cluster, meaning their using the same physical hardware and physical servers. Both VMs have 4 cores. Both use cooked disk and a Storage Pool. Both have Huge Pages enabled and sufficient RAM allocated to fit all memory segments. Both utilize KAIO, and have 4 or more additional supporting AIO threads. The KAIOON environment variables are set the same.
We've spent a considerable amount of time reviewing configurations at both the Informix and OS levels and haven't been able to make much additional progress. What are some potential OS and/or Informix settings/configs you recommend our DBAs and System Admins look into that may be a culprit here? If you'd like any additional information on our setup or Informix settings please let me know.
I am going to do something that I rarely do here. I am going to suggest that you contract with me to do a Server Health Check to figure out what's happening or at least to contract for a block of hours with me observing your system live to figure out what's going in and how to best address it because this does not seem to be an obvious issue that I can point to with no further details of observations.
Thanks for taking the time to review my post. We were able to determine latency from Replication was involved with the slower Inserts and Deletes. My response to Mike goes into further detail.
Are you using any replication?
Thanks for the reply and for reviewing my original post.
After additional testing yesterday afternoon we found Near_Sync Replication and a 2ms latency between the Primary and Secondary was causing the Insert and Delete queries to take longer.
Our old 11.70FC7 servers were all standalones. Our new 14.10FC9 servers include a non-Prod standalone, a non-Prod HDR environment, and a Prod HDR environment.
When SQL Tracing the non-Prod 14.10FC9 standalone instance yesterday afternoon we were able to see identical Insert and Delete speeds as the old 11.70FC7 servers, and that helped connect the dots that Replication latency was involved for the slower Inserts and Deletes on the other servers.
The app team's 4GL process is still running significantly slower on the new standalone server, but we know now Informix and Disk IO speeds are not responsible.
This discussion can be considered closed. :)
Chad - thanks for posting the update. I hope that you can sort out the latency issues, or perhaps consider RSS for the time being, esp. if the Primary and HDR are in separate locations. There are a few things to tweak with replication so hopefully you can overcome the latency problem.
You will get much better results if you commit only every few thousand rows. The following is example SPL/4GL pseudo-code:
LET nrows = 0;
FOREACH ... WITH HOLD ...
LET nrows = nrows + 1;
IF MOD(nrows) = 1000 THEN
(WITH HOLD cursors are not closed by COMMIT.)
It will then send full logical log buffers to the replica with far fewer acknowledgements needed. This will be quicker even without replication.