Informix


Large memory segment vs smaller ones

  • 1.  Large memory segment vs smaller ones

    Posted Thu August 08, 2024 05:25 AM

    Hi,

    IBM Informix Dynamic Server Version 11.70.FC5XE

    Which is better: one large memory segment or several smaller ones?

    For now, we have the following:

    Segment Summary:
    id         key        addr             size             ovhd     class blkused  blkfree
    87031833   52564801   700000010000000  68719476736      805740152 R     16371610 405606
    90177556   52564802   700001010000000  68719476736      805308408 V     16777088 128
    102760471  52564803   700002010000000  68719050752      805303416 V     16777112 0
    70254606   52564804   700003010000000  20971520000      245761992 V     5119996  4
    103809046  52564805   700003500000000  20480000000      240001992 V     4999698  302
    65011727   52564806   7000039d0000000  20971520000      245761992 V     5119290  710
    73400329   52564807   700003ec0000000  20971520000      245761992 V     5118874  1126
    87031821   52564808   7000043b0000000  20971520000      245761992 V     5117879  2121
    82837521   52564809   7000048a0000000  20971520000      245761992 V     1693440  3426560
    Total:     -          -                331495604224     -        -     77094987 3836557

    And we are still suffering regular rsam memory block header corruptions, which eventually result in server termination.

    Does it make sense to play with SHMVIRTSIZE / SHMADD?

    SHMVIRTSIZE             20000000
    SHMADD                  20480000
    EXTSHMADD               8192
    SHMTOTAL                0
    SHMVIRT_ALLOCSEG        0,3



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------


  • 2.  RE: Large memory segment vs smaller ones

    Posted Thu August 08, 2024 06:13 AM

    Dennis:

    One segment is better than many, so whenever you see several virtual segments it is best to increase SHMVIRTSIZE to incorporate all of that into a single initial segment, unless you know that the growth was caused by an unusual event, such as a large session that runs, say, once a quarter.
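    As a rough illustration of the arithmetic only (both parameters are in KB, so check your own onstat -g seg totals after a typical peak, and any platform limit on segment size):

    # the V-class segments above add up to roughly 262 GB (~256,000,000 KB)
    SHMVIRTSIZE             256000000
    # keep SHMADD large so that any further growth arrives as a few big segments
    SHMADD                  20480000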

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 3.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 03:07 AM

    Hi Dennis.

    Also, have you allocated huge pages on Linux or enabled large pages on AIX? That might help with your rsam memory block header corruptions.



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 4.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 03:20 AM

    Doug,

    Could you please give more details on AIX's large pages?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 5.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 03:44 AM

    https://www.ibm.com/docs/en/informix-servers/14.10?topic=products-ifx-large-pages-environment-variable

    https://www.ibm.com/docs/en/aix/7.3?topic=performance-large-pages

    You will need RESIDENT 2 or -1 as large pages are pinned.

    There is another approach on AIX (not as effective but easier):

    export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K@SHMPSIZE=64K

    The default is only 4K.



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 6.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 04:49 AM

    Doug,

    Thank you so much for the advice.

    Just to be clear: I need to export LDR_CNTRL before running oninit? And that is instead of setting up large pages?

    Next, what size of memory is considered large?

    When IFX_LARGE_PAGES is enabled, the use of large pages can offer significant performance benefits in large memory configurations.

    Is 610GB large?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 7.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 05:01 AM

    Yes - just export LDR_CNTRL when starting "oninit" so it doesn't affect anything else. That predates large page support in Informix on AIX. We have that on all the AIX systems we support. None of those have as much memory as yours, so I would think large pages (Informix will use 16MB) must be a good thing for you. We use the equivalent "huge pages" on all Linux systems regardless of size.
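    For example, the start script could do something like this (a sketch only; the point is that the variable is set just for that one command):

    export LDR_CNTRL=DATAPSIZE=64K@TEXTPSIZE=64K@STACKPSIZE=64K@SHMPSIZE=64K
    oninit                  # start the server with 64K pages
    unset LDR_CNTRL         # so nothing started later inherits it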



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 8.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 05:49 AM

    Doug,

    Thank you!

    One more question, if you don't mind. Which metrics, from OS or IDS, might indicate that we have a positive impact on performance?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 9.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 06:09 AM

    The main objective here is to improve stability by reducing your memory block header corruptions. Others on this forum will no doubt have views on measuring performance impact, but the obvious one would be CPU usage.
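    For example (just the obvious starting points, not an exhaustive list):

    vmstat 5 5        # overall CPU and paging on AIX
    onstat -p         # bufwaits and read/write cache percentages
    onstat -g glo     # CPU time used by the oninit virtual processors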



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 10.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 08:30 AM

    Doug,

    How can I make sure that LDR_CNTRL has taken effect?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 11.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 08:40 AM

    You can check the environment of a process with "ps eww":

    https://www.ibm.com/support/pages/verification-environment-settings-running-process-under-aix
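    For example, something along these lines (the PID is whatever the master oninit process is):

    ps -ef | grep oninit                          # find the master oninit PID
    ps eww <pid> | tr ' ' '\n' | grep LDR_CNTRL   # check its environment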

    Not sure otherwise how you prove it's effective.



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 12.  RE: Large memory segment vs smaller ones

    Posted Tue August 13, 2024 05:34 AM

    Doug,

    I found a way to prove whether LDR_CNTRL is effective:

    svmon -S -O filtercat=shared,shmid=on,mpss=on

    Without the setting, it shows s (small, 4K) and m (medium, 64K) pages.

    After the restart, the small pages are all gone.



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 13.  RE: Large memory segment vs smaller ones

    Posted Tue August 13, 2024 05:58 AM

    Thanks, Dennis - that's good to know, and also works for me.

    Have your memory header corruptions reduced, or is it too early to say?



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 14.  RE: Large memory segment vs smaller ones

    Posted Tue August 13, 2024 06:24 AM
    Edited by Dennis Melnikov Tue August 13, 2024 06:26 AM

    Doug,

    For now, I've just restarted our test server, which doesn't suffer the corruptions, just to test the application.

    The setting proved to take effect, so the next step is to restart our prod server soon.

    After that, I will configure 16M pages on the test server, and then on prod.



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 15.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 09:38 AM

    Doug,

    Do I need any changes in ONCONFIG before running `oninit` with LDR_CNTRL?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 16.  RE: Large memory segment vs smaller ones

    Posted Fri August 09, 2024 09:44 AM

    No ONCONFIG changes are required.



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 17.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 05:13 AM

    Unfortunately, LDR_CNTRL with 64K pages did not bring our prod server the desired stability, so now I'm going to move forward to 16M pages.

    1. Do I need to set v_pinshm=1 for Informix?
    2. Given that AIX doesn't page out large pages, why still use RESIDENT to lock segments in memory?
    3. We are suffering RSAM pool memory buffer header corruptions. As the buffer-header table resides in the resident segment of shared memory, does it make sense to allocate large pages for that segment only, and thus set RESIDENT to 1?
    4. Given a Buffer Turnover Rate of 5.75/hr, should I enlarge the BUFFERPOOL to lower the BTR, and with it the buffer-header table access rate?
    5. BTW, how do I lower the Bufwaits Ratio for the 16K cache?

    BUFFERPOOL      default,buffers=3500000,lrus=500,lru_min_dirty=0.20,lru_max_dirty=0.50
    BUFFERPOOL      size=4K,buffers=45000000,lrus=500,lru_min_dirty=0.06,lru_max_dirty=0.14
    BUFFERPOOL      size=16K,buffers=625000,lrus=500,lru_min_dirty=2.00,lru_max_dirty=5.00

    Art Kagel's newratios.ksh output:

    Metric Ratio Report For 4K Cache
            Bufwaits Ratio:             1.120000%
            Buffer Turnover Rate:         5.81/hr
            Used Buffer Turnover Rate:    5.81/hr
            Experimental BTR #2:          1.04/hr
            Experimental BTR #3:          1.86/hr

    Metric Ratio Report For 16K Cache
            Bufwaits Ratio:             7.350000%
            Buffer Turnover Rate:         1.63/hr
            Used Buffer Turnover Rate:    1.63/hr
            Experimental BTR #2:          1.08/hr
            Experimental BTR #3:          1.90/hr

    Metric Ratio Report Summary For All Caches
            ReadAhead Utilization:      43.200000%
            Bufwaits Ratio:             4.030000%
            Buffer Turnover Rate:         5.75/hr
            Used Buffer Turnover Rate:    5.75/hr
            Experimental BTR #2:          1.04/hr
            Experimental BTR #3:          1.86/hr
            Lock Wait Ratio:            0.00000%
            Sequential Scan Ratio:      1.24000%



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 18.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 06:16 AM

    Hi Dennis.

    1. Yes: "vmo -p -o v_pinshm=1".
    2. You must have RESIDENT 2 or more for large pages as they must be pinned.
    3. The "rsam" pool is in the dynamic segment. You need RESIDENT 2 or more.
    4. It doesn't sound like the problem is with the buffer pool but the "rsam" pool. Your BTR is fine anyway.
    5. I will leave Art to answer that!

    I wasn't sure whether large pages were supported by IDS 11.70 on AIX, but this old forum thread implies they are:
    http://old.iiug.org/forums/ids/index.cgi/noframes/read/26661
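    Putting points 1 and 2 together, the rough shape of the setup is something like this (a sketch only; the lgpg_regions count is just a placeholder sized to what you intend to pin, and depending on the AIX level the lgpg tunables may need bosboot and a reboot):

    vmo -p -o v_pinshm=1                                    # allow pinned shared memory
    vmo -r -o lgpg_size=16777216 -o lgpg_regions=<count>    # reserve 16 MB large pages
    bosboot -a                                              # then reboot if required
    chuser capabilities=CAP_BYPASS_RAC_VMM,CAP_PROPAGATE informix

    export IFX_LARGE_PAGES=1    # in the environment that starts oninit
    # plus RESIDENT 2 (or -1) in ONCONFIG so the segments stay pinned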

    Can you share an example assert file?



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 19.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 06:43 AM

    Doug,

    Do you need the whole 1G AF file?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 20.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 06:51 AM

    Probably! Will message you privately for that.



    ------------------------------
    Doug Lawry
    Oninit Consulting
    ------------------------------



  • 21.  RE: Large memory segment vs smaller ones

    Posted Mon August 19, 2024 05:02 AM
    Edited by Dennis Melnikov Mon August 19, 2024 08:35 AM

    Thanks, Doug, that brought my attention to these details.

    `onstat -g mem` shows the two "ovrfl-buff0" pools as the biggest segments.

    name         class addr             totalsize        freesize         #allocfrag #freefrag
    ovrfl-buff0  V     700001040001040  67913740288      4864             2          2
    ovrfl-buff0  V     700001040002040  63615004672      4864             2          2
    res-buff0    R     7000003625de040  52791328768      4864             2          2
    resident     R     70000004006a040  13461045248      4864             2          2
    ovrfl-buff3  V     700001f0fc01040  10240024576      4864             2          2
    mt           V     700001f0fc25a58  1072594944       8056
    rsam         V     700001f0fe3a040  965901864        12394168         959051     4448
    global       V     700001f0fc02040  798392320        2968104          3211921    5160
    mt           V     700001f0fc23040  634150912        21905768         3244389    211923
    aio          V     700001f18b2c040  219742208        1103752          18281      3014

    What are those segments?

    How can the 67,913,740,288 bytes of the first ovrfl-buff0 fit between addresses 700001040001040 and 700001040002040?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 22.  RE: Large memory segment vs smaller ones

    Posted Mon August 19, 2024 08:31 AM

    Dennis -

    Have you configured your BUFFERPOOLs to extend automatically, i.e. are you using the "extendable" keyword in the BUFFERPOOL parameter?  I am not sure if this was available in 11.70.

    Do you have dbspaces defined with page sizes for which there is no BUFFERPOOL entry and you are using the "default" bufferpool?



    ------------------------------
    Mike Walker
    xDB Systems, Inc
    www.xdbsystems.com
    ------------------------------



  • 23.  RE: Large memory segment vs smaller ones

    Posted Mon August 19, 2024 08:54 AM

    Mike,

    BUFFERPOOL      default,buffers=3500000,lrus=500,lru_min_dirty=0.20,lru_max_dirty=0.50
    BUFFERPOOL      size=4K,buffers=45000000,lrus=500,lru_min_dirty=0.06,lru_max_dirty=0.14
    BUFFERPOOL      size=16K,buffers=625000,lrus=500,lru_min_dirty=2.00,lru_max_dirty=5.00

    All dbspaces are 4K or 16K.



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 24.  RE: Large memory segment vs smaller ones

    Posted Mon August 19, 2024 12:37 PM

    I believe that the largest memory segment for Informix on AIX is 64 GB.  In 11.70, we would normally expect the bufferpools to be created in the resident segment.  Based on your first post showing the memory segments, it looks like the "resident" segment has been maxed out, and then the first 2 "virtual" segments are to accommodate the remainder of the 190 GB for your bufferpools.  I expect that these are the ovrfl-buff0 memory segments that you see.
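    As a rough cross-check: 45,000,000 buffers x 4 KB is about 184 GB, which is almost exactly your res-buff0 pool (52.8 GB) plus the two ovrfl-buff0 pools (67.9 GB + 63.6 GB) in the onstat -g mem output, so those do look like the 4K bufferpool spilling out of the resident segment.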



    ------------------------------
    Mike Walker
    xDB Systems, Inc
    www.xdbsystems.com
    ------------------------------



  • 25.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 06:51 AM

    BTW, between the 'large' assertion failures--I mean those that are logged in online.log--the server throws 'small' ones, i.e. files of 700+ bytes that appear only in DUMPDIR, like this one:

    12:10:06  Found during mt_shm_free 1
    12:10:06  Pool 'rsam' (0x700001f0fe3a040)
    12:10:06  Bad block header 0x700003c9c048ea8
    blk-64
    0700003c9c048e68: 0e000079 38091cf1 0010197c 00000e0e   ...y8... ...|....
    0700003c9c048e78: 0700003c 9c048ea8 0700003c 9c048e08   ...<.... ...<....
    0700003c9c048e88: 00000000 00000040 00000000 0001a2af   .......@ ........
    0700003c9c048e98: 00000150 00000008 00000000 00000000   ...P.... ........
    blk+64
    0700003c9c048ea8: 0e000079 28011c91 0010197c 00000e0e   ...y(... ...|....
    0700003c9c048eb8: 0700003c 9c048ee8 0700003c 9c048e68   ...<.... ...<...h
    0700003c9c048ec8: 00000000 00000040 00000000 00001e84   .......@ ........
    0700003c9c048ed8: 00000151 00000008 00000000 00000000   ...Q.... ........



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 26.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 07:10 AM

    Hello,

    An Assert Failed is often caused by an internal corruption in the shared memory,
    triggered by some earlier action (e.g. a select, a request for more shared memory, a cleanup of something, a sort, ...).
    Some time later, some other session hits the corrupted memory and throws the Assert Failed.

    The normal way out is to move to a newer version and see whether it still happens, as a lot of memory corruptions have been fixed over the years.
    What you can do too:
    1) Set DIAG_AFWARN=on in $INFORMIXDIR/etc/evidence.sh.
    Then you will get a "full" assert-failed file for each of these Assert Failed events.
    2) Check onstat -g sql / onstat -g ses in the AF file to see whether the same or a similar statement is always throwing the AF.
    Then you can check whether it reproduces with that statement; if it does, you can try to change the statement.

    Please also check that DUMPSHMEM 0 is set;
    otherwise you will get a lot of shared-memory dumps which only fill up your file system.
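    For example, you can check both quickly like this (onstat -c just prints the ONCONFIG the server is using):

    onstat -c | grep DUMPSHMEM
    grep DIAG_AFWARN $INFORMIXDIR/etc/evidence.sh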



    ------------------------------
    Hedwig Fuchs
    ------------------------------



  • 27.  RE: Large memory segment vs smaller ones

    Posted Fri August 16, 2024 01:28 PM

    Dennis:

    What are your settings for AUTO_TUNE and AUTO_LRU_TUNING?

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 28.  RE: Large memory segment vs smaller ones

    Posted Mon August 19, 2024 04:11 AM

    Art,

    AUTO_LRU_TUNING 1

    There is no AUTO_TUNE parameter in 11.70.



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 29.  RE: Large memory segment vs smaller ones

    Posted Tue September 10, 2024 07:40 AM

    Oops, I found this in my Drafts. Thought I had sent it out already.

    Dennis:

    There are two ways to improve bufwaits. Remember that bufwaits represent contention either for individual buffers or, far more commonly, for the LRU queues themselves. Given that, the most important thing is to have enough LRU queues to prevent many sessions from waiting on a single queue. Sometimes, even though the BTR3 is low, as it is in your case, more buffers will reduce the contention. That is worth a try, especially since you seem to have nearly maxed out the number of LRU queues at 500. You could try going up to 512; I know that's a tiny change, but your bufwaits ratio is only 7.35%, which is not terrible, and this might just tick you over into the green zone.
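    For illustration, that would just be a one-value change to your existing 16K entry (everything else left as you have it):

    BUFFERPOOL      size=16K,buffers=625000,lrus=512,lru_min_dirty=2.00,lru_max_dirty=5.00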

    If that does not work, and increasing buffers a bit also does not help (it won't if the contention is for a small number of buffers), you could try tweaking a couple of undocumented parameters that control how LRU selection works. A quick tutorial on that:

    LRU selection comes into play when a page has to be added to the buffer cache.
    When a clean page has to be moved to the dirty side after it is first modified (or after it has been cleaned), or when a page has to be moved from the least recently accessed end of the queue to the most recently accessed end, the queue is already known. A latch is still needed, so a bufwait is still possible, but these parameters will not ameliorate those waits.
    When a session needs to add a page to the cache, it randomly selects a queue by hashing on its session id and attempts to acquire that LRU's latch. That means that if there are many sessions, several might hash to the same queue and most will have to wait (i.e. bufwait). In addition, those other sessions needing to move buffers around the queue will also line up to wait for the latch.
    A waiting session will spin on the latch several thousand times, sleep for a bit, try again, then time out and rehash to another LRU. But remember that all those other sessions will be doing the same thing, so many will rehash to the same "other" LRU, along with timed-out sessions from other LRUs.
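    Either way, a simple before/after check is to zero the profile counters at a quiet moment and compare bufwaits over a fixed stretch of representative load, e.g.:

    onstat -z         # zero the profile counters
    # ... run the usual workload for a while ...
    onstat -p         # compare the bufwaits counter before and after the change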

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 30.  RE: Large memory segment vs smaller ones

    Posted Wed September 11, 2024 05:33 AM

    Art,

    What are the two undocumented parameters to tweak?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 31.  RE: Large memory segment vs smaller ones

    Posted Wed September 11, 2024 06:10 AM

    Dennis:

    I pinged someone in development. He thinks that the one parameter I was thinking of (turns out it wasn't two) may no longer be active. He was going to look it up for me, but I have not heard back. I'll follow up, but they are swamped working to close v15.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 32.  RE: Large memory segment vs smaller ones

    Posted Wed December 04, 2024 07:31 AM
    Edited by Dennis Melnikov Thu December 05, 2024 04:12 AM

    How can I perform a relevant test to evaluate the benefits that large pages bring?



    ------------------------------
    Sincerely,
    Dennis
    ------------------------------



  • 33.  RE: Large memory segment vs smaller ones

    Posted Wed December 04, 2024 08:31 AM

    Using several smaller memory segments instead of one large segment is often better for stability, particularly in preventing memory block header corruption in IBM Informix. Adjusting parameters like SHMVIRTSIZE and SHMADD can help fine-tune memory allocation, but it's important to monitor the system after making changes to ensure improved performance and stability.



    ------------------------------
    james colin
    ------------------------------