Informix


Reviving of an old RFE that should have more votes: 16 millions data pages limitation

  • 1.  Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Mon October 07, 2019 09:42 AM
    Hi All

    In a recent discussion, we have been talking about an RFE that many of us would like to see resolved.

    It concerns the limit of 16 million data pages in a single fragment. I cannot believe that only 6 people want this limitation lifted.
    https://ibm-data-and-ai.ideas.aha.io/ideas/INFX-I-31

    If this RFE concerns you, please go to the URL above and vote (click on the vote count).

    Eric



    ------------------------------
    Eric Verceletto
    Founder
    kandooerp.org
    Pont l'Abbé, France
    +33 626 52 50 68
    ------------------------------

    #Informix


  • 2.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Mon October 14, 2019 07:31 PM
    That limit is probably hard to lift. It would probably require page format changes, and that's risky.
    16M pages is at least 32GB (16,777,216 pages x 2KB at the default page size). With bigger page sizes that can be 64GB or 128GB. One probably doesn't want bigger fragments than that.
    I wouldn't vote against it, but there are lots of more important things to implement, in my view...

    Regards.

    ------------------------------
    FERNANDO NUNES
    ------------------------------



  • 3.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Fri October 25, 2019 06:31 AM
    Hi Eric,

    I agree that this limit can be a pain and have voted for the RFE. I did discuss this with a member of the Informix development team at IIUG 2018, and it is something they have given thought to. There are some recent features which make manipulating tables easier.

    Given that hitting this limit could be a system-down situation, it would be good if InformixHQ checked it out of the box. (I haven't used the latest HQ, so apologies if this is already a feature.)

    For a new database, interval partitioning gives you a way of avoiding this (if you have Enterprise Edition); see the sketch below. Given the constraints imposed by the 255 rows per page limit, and the complications from variable-length columns or compression, there are a few factors to think about when designing a storage schema. Perhaps another RFE would be to request a tool to help with this and avoid such problems in the first place?
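
    A minimal sketch of interval fragmentation, assuming a date-keyed table and dbspaces named dbs1 and dbs2 (all names here are illustrative): the engine adds a new fragment per interval automatically, so no single fragment has to absorb the whole table.

    -- hypothetical table; a new fragment is created per month as rows arrive
    CREATE TABLE orders (
        order_id   BIGINT,
        order_date DATE
    )
    FRAGMENT BY RANGE (order_date)
        INTERVAL (NUMTOYMINTERVAL(1, 'MONTH'))
        STORE IN (dbs1, dbs2)
        PARTITION p_initial VALUES < DATE('01/01/2019') IN dbs1;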

    Ben.

    ------------------------------
    Benjamin Thompson
    ------------------------------



  • 4.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Fri October 25, 2019 06:57 AM
    Hi Ben,

    You are right, the solution is easy when you have Enterprise Edition, but much less easy when you don't :-)

    So yes, having an out-of-the-box HQ event to warn or alert in a timely manner when getting near the limit is a very good suggestion, but what do we do once we are informed, if we don't have Enterprise? The only possible consequence will be a downtime OR a downtime...

    I agree that increasing this limit is a very sensitive modification, but...



    ------------------------------
    Eric Verceletto
    Founder
    kandooerp.org
    Pont l'Abbé, France
    +33 626 52 50 68
    ------------------------------



  • 5.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Thu November 07, 2019 09:59 AM
    Hi Eric.

    I've just rejoined the community and found something familiar about your question.

    I had to persuade a client to allow me to move some of their more monstrous tables to DBspaces with 16K pages.  And then I set up a fragmentation scheme, so the 16-million page limit became quite the non-issue.  It seems to me that once you have a fragmentation scheme you can keep adding fragments to the table as your space fills up.  If I were one of the developers I would likely not put high priority on this, mainly owing to the effort but also because of the relatively easy workaround.  That is, until you run afoul of a limit on the number of partitions in a table.  I think that's a signed 16-bit number, so roughly 32,767 partitions.

    As to the effort:
    To reiterate something I think everyone on this forum already knows: the rowid of a row is currently a 24-bit logical page offset within a partition (call it a fragment if you prefer), with the remaining 8 bits being the slot number; a small decoding sketch follows below.  If I recall Dave Kosenko's class correctly, this has been the case since Informix-OnLine 4.0.  This is *SUCH* an entrenched assumption in all the code; I don't see how they could comb the code for every place that assumption would have to change.  Yeah, with 20/20 hindsight (with which we are all gifted), we could have made the rowid a 64-bit combination of:
    - Partition number (assuming 16 bits)
    - Logical page offset within the partition (and let's raise that to 32 bits, unsigned)
    - Slot number within the page (and let's raise that to 16 bits)

    It would also eliminate the need for the bogus rowid used in "create table ... {partition options} WITH ROWID"
    But that's hindsight for you.
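
    For illustration, today's rowid can be split back into those two pieces with plain arithmetic (a sketch; the table and column names are made up):

    -- rowid is selectable on non-fragmented tables;
    -- upper 24 bits = logical page within the partition, lower 8 bits = slot
    SELECT rowid,
           TRUNC(rowid / 256) AS logical_page,
           MOD(rowid, 256)    AS slot
      FROM mytable
     WHERE pk_col = 42;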

    That's my $0.02 (US) on the matter.

    -- Jacob S.

    ------------------------------
    Jacob Salomon
    ------------------------------



  • 6.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Thu November 07, 2019 10:07 AM

    Agreed... This is not really a big issue for medium to large shops with DBAs.
    The "hype" around this is really because it's a huge problem if it "attacks" you silently...
    Solving it usually requires significant downtime and can be hard and slow...

    Nevertheless, along the way there have been some important changes in the page format... If another one happens in the future I would drag this issue into the picture... Otherwise I would not give it a very high priority... assuming it will cause issues from time to time at some customers... mostly the ones who trust that Informix is invisible...

    I have a long list of things that are more urgent/important and, above all, easier to implement...

    Regards.



    ------------------------------
    FERNANDO NUNES
    ------------------------------



  • 7.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    IBM Champion
    Posted Thu November 07, 2019 10:09 AM
    Welcome back into the fold Jacob!

    Please note that I have discovered that there is actually an undocumented rowid, called ifx_row_id, that works even on partitioned tables (and on non-partitioned tables as well). See my blog entry from yesterday for details:


    It is not drop-in compatible with the normal ROWID on a non-partitioned table or with the WITH ROWID column, but it is available, it does work, and it is fast!
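
    A minimal example (a sketch; the table and column names are illustrative):

    -- ifx_row_id works on fragmented and non-fragmented tables alike
    SELECT ifx_row_id, order_id
      FROM orders
     WHERE order_id = 101;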

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.



  • 8.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Thu November 07, 2019 11:03 AM

    If I'm not mistaken, this was created when VERCOLS were introduced for updatable secondaries...

    I've found some discussions about this in comp.databases.informix dating from 11 years ago (! :) )

    The discussion was about it using a sequential scan to find the row, and there was an APAR on the same topic:

    http://www-01.ibm.com/support/docview.wss?uid=swg1IC94738

    But I suppose you tested it, as you say it's fast.

    My problem with this... is that I think it's really bad practice to use the rowid (or this) in applications. It opens the door to issues, creates a dependency on something "internal", and I haven't yet seen a case where it is required.
    A cursor and "WHERE CURRENT OF" should avoid the use of ROWIDs (old or "new") and doesn't raise the same issues; see the SPL sketch below.
    Access by rowid is only useful/fast if you have already found the row and want to access it again... CURRENT OF does the same...
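
    A minimal SPL sketch of that pattern, with made-up table and column names:

    -- delete qualifying rows via a named FOREACH cursor, no rowids involved
    CREATE PROCEDURE purge_old_rows()
        DEFINE v_id INT;
        FOREACH cur1 FOR
            SELECT order_id INTO v_id FROM orders
             WHERE order_date < TODAY - 365
            DELETE FROM orders WHERE CURRENT OF cur1;
        END FOREACH;
    END PROCEDURE;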

    One interesting aspect of ifx_row_id is that it could possibly replace the use of ROWID in ISQL... I just had a customer where a table was fragmented (without ROWIDs), which made ISQL fail against that table... Whether an RFE for ISQL is viable is another matter, but it's worth trying...

    Regards.



    ------------------------------
    FERNANDO NUNES
    ------------------------------



  • 9.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    IBM Champion
    Posted Thu November 07, 2019 11:25 AM
    Fernando:

    I agree that using ROWID in applications is in general a bad idea, and it certainly cannot be used as some kind of foreign key. That said, there are applications where ROWID/ifx_row_id is the best solution. For example, I just finished updating my dbdelete utility (and I will do my dbmove utility next) to use ifx_row_id so that it will work with partitioned tables that do not have WITH ROWID.

    Those utilities fetch 8192 ROWIDs or ifx_row_ids and use them to build a DELETE or INSERT INTO ... SELECT ... statement with large IN() clauses to delete/copy data VERY quickly. This is MUCH faster than deleting or copying one row at a time with WHERE CURRENT OF.
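
    The generated statements look roughly like this (a sketch; the ifx_row_id values shown are invented):

    -- one round trip deletes a whole batch of previously fetched rows
    DELETE FROM sometable
     WHERE ifx_row_id IN ('1048918:257', '1048918:258', '1048919:1');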

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.



  • 10.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Thu November 07, 2019 06:19 PM

    "Those utilities fetch 8192 ROWIDs or ifx_row_ids and use them to build a DELETE or INSERT INTO ... SELECT ... statement with large IN() clauses to delete/copy data VERY quickly. This is MUCH faster than deleting or copying one row at a time with WHERE CURRENT OF."

    If we had more time, this would be a very interesting topic.

    I'm almost sure that the reason it works better is that you avoid the client/server round trips (just like table tennis, where the ball spends much more time in the air than in contact with the racquets).

    Now... from what I remember, CURRENT OF can be used in a stored procedure... :)

    I still think there is no need to use any form of rowid :)

    Regards.



    ------------------------------
    FERNANDO NUNES
    ------------------------------



  • 11.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Thu November 07, 2019 06:02 PM
    Hi Art.

    I am gratified that there is an alternative to a genuine rowid column in partitioned tables.  It is a good thing to know, and it is effectively a 64-bit identifier, better than the layout I was thinking of.  However...

    We have drifted away from the original subject of this discussion - Eric's plaint about chafing at the 16-million page limit for a partition.  I see that Ben has suggested a timely warning.  I implemented just such a nightly job at Barnes & Noble as well as at Onyx Centersource (nee Onyx Business Intelligence) in Dallas.  I think I set an arbitrary threshold of about 15 million pages, and any partition exceeding that threshold would be included in an email to the admins.  Another threshold (I forget how high) would generate a pager alert, as would a log message about no more pages.

    Of course, my repeated requests to the operations folks - to give us a heads-up when planning a massive load - went unheeded.  But I could tell them the exact time their load barfed.  But that's another war story altogether.  (Harrumph! Not the kind you tell your grandchildren!  :-)

    Back to the diversion about ifx_row_id: I see from your blog entry that the numbers on either side of the ':' are in decimal.  There is an advantage to using hex for the rowid component, such as when you need to track down actual corruption and want the logical page number; see the one-liner below.  But that is yet another story. :-)
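
    For example (a sketch; any non-fragmented table will do, and the names here are made up):

    -- HEX(rowid) returns something like 0x00012C05:
    -- logical page 0x00012C (300), slot 0x05 (5), readable at a glance
    SELECT HEX(rowid) FROM mytable WHERE pk_col = 42;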

    'Bye now.

    ------------------------------
    Jacob Salomon
    ------------------------------



  • 12.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    IBM Champion
    Posted Thu November 07, 2019 06:22 PM
    Yes, ifx_row_id is a VARCHAR with two numeric components separated by a colon. The left value is the partnum of the partition in which the row resides, and the right value is the traditional ROWID within that partition.

    So, if you ever want an efficient way to find all rows with a particular filter value on a column other than the fragmentation column, restricted to a particular partition, tada!

    select * from sometable where filt_col = 34 and ifx_row_id matches partnum::VARCHAR||':*';
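
    For example, with a hypothetical partnum of 1048918:

    select * from sometable where filt_col = 34 and ifx_row_id matches '1048918:*';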

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.



  • 13.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Thu November 07, 2019 06:23 PM

    Regarding the timely warning: at the following link you can find a "framework" for scheduler tasks.
    One of them does exactly that (monit_check_numpages.sql):

    https://github.com/domusonline/InformixScripts/tree/master/monit

    Regards.



    ------------------------------
    FERNANDO NUNES
    ------------------------------



  • 14.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Fri November 08, 2019 09:56 AM
    Hi!

    We also sometimes have trouble with "no more pages" (and it's hard to monitor).
    And I'm still in a long discussion with support about how we can really monitor this issue.

    In monit_check_numpages.sql, you use sysmaster:sysptnhdr.npused as the threshold column.
    Does this really work for you?

    My last information was that this column isn't correct when you delete rows (and there are problems with binary columns).

    Example:
    - Table with 47 million rows -> no more pages
       -> npused=16777215 (=nptotal)

    - Then I deleted 7 million rows
       -> npused is still at 16777215.

    Have you found a solution for that?

    (Repack/shrink isn't really a solution -> performance problems.)

    ------------------------------
    Stefan Wilfling
    ------------------------------



  • 15.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Fri November 08, 2019 10:17 AM

    When you reach the 16M "used pages" you may still be able to work, because not all the pages are full and the engine may find some slots in which to INSERT new rows.

    But the best practice would be to trigger the alarm long before you reach the 16M used pages...

    If you want to be conservative you can set it to 10M, for example. The purpose is to give you plenty of time to prepare the solution when you start risking a problem, not to warn you when you're about to receive an error, as that would not give you enough time to really avoid it.

    As such, yes, the task has been working for a few customers, in the sense that they never faced the error again. One of them simply took the opportunity to rebuild a table during a planned downtime for an upgrade. If you avoid the error, it means it works for you. If you're concerned that it might warn you too soon, or in situations where you would never actually hit the problem, then yes, depending on the threshold you use, that can happen. But that is infinitely better than hitting the error :) And the threshold may be set closer to or further from the real limit depending on your table's growth rate, the impact that facing the error would have, etc...

    In summary, the idea of the task was never to warn you when you've exhausted the free slots. The engine already tells you that by failing each INSERT with an error. The idea is to warn you in time to plan the rebuild of the table.

    Regarding the shrink... it can potentially free pages, but that's not the main concern here. And what problems did you face with it? The one issue I recall is that I don't like running it "online". And for "offline" you can just rebuild the table...

    Regards.



    ------------------------------
    FERNANDO NUNES
    ------------------------------



  • 16.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Fri November 08, 2019 10:25 AM

    The check against 10M is exactly how we have been approaching this, with the monitoring query below; it has been working well for us and providing enough time to get ready for action (but I would rather see this limit lifted/increased):

    select dbsname, tabname, ti_nptotal pages
      from sysmaster:systabinfo, sysmaster:systabnames
      where ti_partnum = partnum
      and dbsname not matches 'sys*'
      and ti_nptotal > 10000000;

    ------------------------------
    Hal Maner
    M Systems International, Inc.
    www.msystemsintl.com
    ------------------------------



  • 17.  RE: Reviving of an old RFE that should have more votes: 16 millions data pages limitation

    Posted Mon November 11, 2019 05:22 AM
    Hi Stefan,

    My tests cannot confirm that npused = nptotal. My findings:
    nptotal = number of allocated pages
    npused = number of used pages
    npdata = number of pages with data

    For example:
    dbsname testdb
    tabname testtable
    ti_npused 93028
    ti_npdata 90185
    ti_nptotal 101552

    After truncating the table:
    dbsname testdb
    tabname testtable
    ti_npused 1
    ti_npdata 0
    ti_nptotal 101552

    Deleting rows means the allocated pages can be reused; the number of allocated pages doesn't change.

    I suggest monitoring ti_nptotal if you want to feel safe. For a growing table, ti_nptotal is always a bit bigger than ti_npdata. The disadvantage is that after deleting rows you will still be alerted.

    If you monitor ti_npdata instead, there is a risk as the number of allocated pages grows near the 16-million page limit. If you set your threshold to - let's say - 14,000,000 pages, the risk is minimal. But if you delete rows or truncate a table, you will not be alerted any more until your threshold is reached again. A query along those lines is sketched below.
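
    For example (a sketch; run it from any database, with the threshold per the above):

    -- report all three counters for any fragment approaching the limit
    SELECT n.dbsname, n.tabname,
           i.ti_nptotal, i.ti_npused, i.ti_npdata
      FROM sysmaster:systabinfo i, sysmaster:systabnames n
     WHERE i.ti_partnum = n.partnum
       AND i.ti_nptotal > 14000000;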

    Regards, Reinhard.

    ------------------------------
    Reinhard Habichtsberg
    ------------------------------


