Informix

 View Only
Expand all | Collapse all

Table Extent Size calculation

  • 1.  Table Extent Size calculation

    Posted Tue November 17, 2020 02:37 AM
    I have to create a new table and need to calculate the extent size. If I don't have data growth forecasting , but customer has given the reference table and as per the reference table below are the data:

    Row size   133
    Number of rows - 52655123
    Number of column - 29


    Can this be used for creating new table or should I ask customer for growth forecasting ?
    And please let me know the extent size calculation formula.

    Thanks
    Amit Patel

    ------------------------------
    AMIT PATEL
    ------------------------------

    #Informix


  • 2.  RE: Table Extent Size calculation

    Posted Tue November 17, 2020 02:50 AM

    Hi Amit,

     

     

    Very important to to mess with is the first extent, because you cannot change this size. The size of the next extent is also important to avoid having a high number of extents over time, but you can modify it anytime: the new value will apply for the next newly created extent, so the impact of giving it a size is not so critical as it can be modified anytime.

     

    If your customer is insisting on not giving you a growth forecast, apply 10% of first extent size to the next size and monitor with sysmaster over time.

    If you see the table has 30 extents in one week, next size is way too low and needs to be increased.

    This query will help

     

    select dbsname,tabname,min(size) as min,sum(size) as sum,count(*) as count from sysmaster:sysextents where dbsname not matches "sys*"group by 1,2;

     

    Eric

     






  • 3.  RE: Table Extent Size calculation

    Posted Tue November 17, 2020 04:14 AM
    Hi Amit,

    You also need to consider:

    1) Page Size on which you will be creating a table 
    2) Table needs to be fragmented or non fragmented
    3) If you don't have data forecasting, then you can look at the reference table growth rate and take out the number of rows.



    ------------------------------
    Gaurav Kumar
    ------------------------------



  • 4.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 07:06 AM
    Hi Gaurav,

     Customer told Data growth forecasting will be 30000-80000 per month....
    So what should be the First and next extent size for table


    Thanks
    Amit Patel

    ------------------------------
    AMIT PATEL
    ------------------------------



  • 5.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 10:14 AM
    Hi Amit,

    While a formula has been provided for you, it might be worth pointing out that there is little benefit in spending loads of time on this, unless you use version 11.50 or earlier. Some of the advice given above would appear to relate to these older versions.

    In recent versions of Informix, 11.70 onwards I believe, the extent algorithm will ensure all new extents are at least 10% of the total table (or fragment) size. This will override the next extent size if it is smaller than this. The database engine will also merge any adjacent extents into a single extent. Therefore if you let the engine do its thing you should not have any issues with too many extents even if you start from the default of 16 kB.

    Having said this, if you know the table will be large then it makes sense to start with something bigger for the initial and next extent sizes. The engine will create larger extents than "next size" once next size is less than 10% of the table size and you can see this with 'oncheck -pe', however "dbschema -ss" will still show the extent sizes set by you.

    It's also worth noting that if you plan to use the table shrink feature in future that a table cannot be made smaller than its initial extent size.

    Ben.

    ------------------------------
    Benjamin Thompson
    ------------------------------



  • 6.  RE: Table Extent Size calculation

    IBM Champion
    Posted Wed November 18, 2020 10:31 AM
    Bejamin is correct in everything he says. However, there are two issues with just creating the table with the default extent sizing of 8K and letting it grow:
    1. If the dbspace into which you are placing it has its free space fragmented by dropped smaller objects, then the initial extent will not likely be contiguous with those created during the load resulting in a fragmented initial load.
    2. Growing the extents during the load will SEVERELY slow down the load versus pre-allocating the disk space by specifying the initial extent size. Similarly specifying the next size will make sure that additional extents are similarly preallocated to a reasonable size and so, if you size for a month's growth, the client's applications will only experience a slight delay to allocate a new extent once a month, rather than more frequently.
    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 7.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 10:40 AM

    And I would add: NEVER count on storage pool to get rid of this calculation (which is fun and a first-do exercise thing for IDS beginners).
    If you let the Storage Pool manage the growth of your table, you will probably end your first load with TONS of extents. This happened to one of my customer who was happy to be a bit lazy.
    Laziness pays sometimes, but not in all cases
    ��

     

    Eric Vercelletto
    Data Management Architect and Owner / Begooden IT Consulting
    Board of Directors, International Informix Users group
    IBM Champion 2013,2014,2015,2016,2017,2018,2019,2020
    ibm-champion-rgb-130px

    Tel:     +33(0) 298 51 3210
    Mob : +33(0)626 52 50 68
    skype: begooden-it
    Google Hangout: eric.vercelletto@begooden-it.com
    Email:
    eric.vercelletto@begooden-it.com
    www :
    http://www.vercelletto.com
    www  https://kandooerp.org

     

     






  • 8.  RE: Table Extent Size calculation

    IBM Champion
    Posted Wed November 18, 2020 11:17 AM
    True, Eric. If you want to depend on the storage pool to allocate new space to a dbspace to accommodate new extents, then make sure to set the extend size for the specific dbspace large enough that the whole extent will fit in it. Otherwise the storage pool will add multiple small chunks to the dbspace using the default extend size which is only 10MB. Since an extent cannot span multiple chunks that means extents smaller than 10000 KB if you don't adjust the extend size of the dbspace. Also adding an extent that requires a new chunk allocation, or even a chunk extension using extendable chunks, will be VERY SLOW so you will want to manually extend the chunk or add a new chunk (with or without storage pool configured) before starting a large bulk load.

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 9.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 11:11 AM
    In short I agree with both points you make and would size extents appropriately for such a load, but I am interested to know more detail about the severity of slow down you would expect to see.

    We did quite a bit of work with IBM support a while back around delays when allocating a new extent which resulted in fixes in late 11.70 releases and 12.10. The main issue we experienced was bitmap page scanning to find free pages when the extents are nearly but not actually full, particularly when many pages need to be scanned to find a free slot. We got down to analysing the number of rows inserted per second. This is only significant for large tables and once the engine decides a new extent is needed, such an extent can be added quickly with little/no discernable performance impact. I appreciate extent addition is not cost free: a mutex lock is needed on the whole dbspace to stop other tables adding an extent at the same time, for example. It's a complicated subject, however I can't see these issues making much difference to loading a table from empty.

    Back to the original poster's query, I think this illustrates why it's important to state what version of the product you are working with!

    Ben.

    ------------------------------
    Benjamin Thompson
    ------------------------------



  • 10.  RE: Table Extent Size calculation

    IBM Champion
    Posted Wed November 18, 2020 11:23 AM
    Benjamin:

    Finished a major migration for a client in January. During practice runs, for the first test run I forgot to adjust the extent sizing for a very large logical table (>2TB of data to each of 9 physical tables treated by the application as one - don't ask) the subsequent runs were more than an hour faster with initial extents set properly. Part of that was that towards the end of the first set of runs the dbspaces' chunks had to auto-extend, but that didn't take an hour or more.

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 11.  RE: Table Extent Size calculation

    IBM Champion
    Posted Tue November 17, 2020 01:46 PM
    Amit:

    To calculate how big to make the initial extent so that the entire new table will fit in a single extent at load time, this is the formula:

    EXTENT SIZE = (nrows * (rowsize + 2)) / (pagesize - 24)

    This will work if 1) the rowsize is smaller than pagesize (yours is for any pagesize), and 2) there are not "special" characters or variable length characters.

    As far as calculating the NEXT SIZE, for that you will need some idea of the table's expected growth rate and access patterns. Ideally you want queries to mostly deal with only one or two extents even if the table has hundreds of them. So, a table that typically is accessed by reports that only look at recent rows, can perform fine with a relatively small NEXT SIZE while a table that is typically queried for 12 month's worth of data will need NEXT SIZE set to hold about a year's worth of data. 

    As far as page size is concerned, you may have to deal with a server that is mandated to have only a single base size for all dbspaces, but if you have the flexibility to create dbspaces with different page sizes from the default (4K on Windows and AIX, 2K everywhere else) then you want to choose a pagesize that will minimize waste. So, for example, your rowsize is 133 bytes (to which you have to add 2 bytes for the slot entry for each row), it is possible for each page to waste as much as 134 bytes. The waste calculation is similar to the extent calculation above:

    waste_per_page = (pagesize - 24) - (((pagesize - 24) / (rowsize + 2)) * (rowsize + 2))
    waste_per_row = waste_per_page / ((pagesize - 24) / (rowsize + 2))

    basically it's:
    waste_per_page = (available space on a page) - ((number of rows on a page) * (space for a row))
    waste_per_row = waste_per_page / (number of rows on a page)

    waste per row is the more important number when comparing different pages sizes. Obviously if the choice is between wasting 10 bytes per row on a dbspace page size that exists and creating a new dbspace with a different pagesize to get it down to 8 bytes per row may not be worth the maintenance overhead and extra memory required to create a cache for a new pagesize, so, like everything else we do as DBAs, we work with tradeoffs and near-optimal choices. Enjoy!

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.









  • 12.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 07:39 AM
    Dear Art,

    Thanks for your reply.

     Customer told Data growth forecasting will be 30000-80000 per month....
    So what should be the First and next extent size for table

    EXTENT SIZE = (nrows * (rowsize + 2)) / (pagesize - 24)

    So How to fit the value to calculate the First and next Extent size in table. I'm new with these terms so confuse with few terms like PAGEUSE , REMSIZE etc.



    Thanks
    Amit Patel

    ------------------------------
    AMIT PATEL
    ------------------------------



  • 13.  RE: Table Extent Size calculation

    IBM Champion
    Posted Wed November 18, 2020 10:11 AM
    Amit:

    OK, so, one thing I forgot in the EXTENT SIZE formula. The formula as posted reports in pages, but the EXTENT SIZE and NEXT SIZE are in KB, so you will have to multiple the result by the number of KB per page in the pagesize of the dbspace you will place the table in. It also did not include the additional overhead pages needed which is one bitmap page for every 4096 data pages. The corrected formula (all fractional values except ROWS_PER_PAGE should be rounded up) is:

    ROWS_PER_PAGE = (pagesize - 24) / (rowsize + 2)
    DATA_PAGES = nrows / ROWS_PER_PAGE 
    BITMAP_PAGES = DATA_PAGES / 4096
    EXTENT SIZE = (DATA_PAGES + BITMAP_PAGES) * (pagesize / 1024)

    In your response you say: "I'm new with these terms so confuse with few terms like PAGEUSE , REMSIZE etc." 
    Honestly I don't know what is meant by PAGEUSE or REMSIZE either. Did someone else use those terms in another reply that I missed? I could guess, but ...

    Anyway, you ask specifically, that the table has 52,655,123 rows initially and will grow at a rate of 30,000 to 80,000 rows per month and that the row size is 133 bytes. So, IF you want to get the whole initial load of the table into a single extent, it would be the result of this calculation, assuming the default pagesize of 2K is used (I'll cover page sizes later, hang on):

    ROWS_PER_PAGE = (2048 - 24) / (133 + 2) = 2024 / 135 = 14.99 = 14 rows
    DATA_PAGES = 52655123 / 14 = 3761080.21428 = 3761081 pages
    BITMAP_PAGES = 3761081 / 4096 = 918.23266 = 918 pages
    EXTENT_SIZE = (DATA_PAGES + BITMAP_PAGES) * (KB per page) = (3761081 + 918) * (2048 / 1024) = 3761999 * 2 = 7523998 KB

    If your chunks are bigger than about 7.3GB and have that much contiguous free space available, then that will work in a single extent. If not, the engine will just allocate a few extents as possible as big as possible given the contiguous space it can find.

    As for the next extent, as I said, that in part depends on how the data is access and, for example, how many month's worth are actively being queried in general. Let's assume one month's data and assume the average between 30,000 and 80,000 row at 55,000 rows. Then the formula above yields:

    ROWS_PER_PAGE = (2048 - 24) / (133 + 2) = 2024 / 135 = 14.99 = 14 rows   -- Notice that nearly a full row or space is wasted!
    DATA_PAGES = 55000 / 14 = 3928.57142 = 3929 pages
    BITMAP_PAGES = 3929 / 4096 = 0.95922 = 1 page
    EXTENT_SIZE = (3929 + 1) * (2048 / 1024) = 3930 * 2 = 7860 KB

    That's an average single month. You can scale that easily to a maximum month or to six months or twelve months worth if you need to.

    So, you say you are "new to these terms", so I'm going to make the possibly incorrect assumption that you also are not aware that Informix's dbspaces can be configured with different page sizes and that a table can be placed in any dbspace you want to place it. By default, when you create a table without an IN <dbspace name> or a FRAGMENT clause that specifies dbspaces, the table is placed in whatever dbspace was used when the database was created, which itself defaults to the ROOTDBS which is a bad idea FWIW. The ROOTDBS, or root dbspace, is created with the system default page size which on AIX and Windows platforms is 4K and on every other platform is 2K. When you create a new dbspace, it can be created with any pagesize that is a multiple of the system default page size up to 16K. Why am I discussing this? Because your client's table width of 133 bytes is rather unfortunate. As noted above, you will be wasting 134 bytes per page or 9.4 bytes per row if you place the table in a 2K dbspace. However, if you place the table in a 4K dbspace, then this happens:

    ROWS_PER_PAGE = (4096 - 24) / (133 + 2) = 4072 / 135 = 30 rows
    WASTE_PER_PAGE = (4096 - 24) - ((4096 - 24) * 30) = 22 bytes
    WASTE_PER_ROW = WASTE_PER_PAGE / ROWS_PER_PAGE = 22 / 30 = 0.73333 bytes

    So you will minimize the wasted disk space substantially in just the initial data load:

    Wasted on 2K pages = 9.42857 * 52655123 = 496,462,513 = ~473MB

    Wasted on 4K pages = 0.73333 * 52655123 = 38613581 = ~37MB

    I hope that covers the subject for you.




    ------------------------------
    Art Kagel
    ------------------------------



  • 14.  RE: Table Extent Size calculation

    IBM Champion
    Posted Wed November 18, 2020 10:19 AM
    Forgot, if you choose 4K pages, then the initial extent size would be 7022400 KB which is, as noted in the waste calculation, about 496 KB less space.


    ------------------------------
    Art Kagel
    ------------------------------



  • 15.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 11:17 AM

    isn't the slot table entry 4 bytes per row ?

     

     

     

     






  • 16.  RE: Table Extent Size calculation

    IBM Champion
    Posted Wed November 18, 2020 11:28 AM
    AARRGGHH!! Yes, thanks Mark. You are correct, the slot entry adds 4 bytes not 2. Don't know where my head is at this week. Maybe not enough coffee, or I'm just getting older. Forgot the slot size, KB instead of pages, and to add in for bitmap pages. Need me some more caffeine!

    I don't think it changes the actual numbers for Amit at all, but anyone using the formulae should make the change.

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 17.  RE: Table Extent Size calculation

    Posted Thu November 19, 2020 03:29 AM
    Data growth is rarely "nice and linear", much better to get some history over the "data cycle" and see what growth fits.

    ------------------------------
    Jon Ritson
    ------------------------------



  • 18.  RE: Table Extent Size calculation

    Posted Wed November 18, 2020 09:01 AM

    Amit

    if you go to my personal website, you'll see links to both spreadsheets and worksheets that walk you through table sizing. They include page size implications as well as used and wasted space calculations. 


    Carlton



    ------------------------------
    Carlton Doe
    ------------------------------



  • 19.  RE: Table Extent Size calculation

    Posted Fri November 20, 2020 12:27 PM
    Wow Amit! You started an interesting conversation. So if you permit me to throw in my two pennies worth...

    As you haven't mentioned the data types used or indexes created for the said table, spend some time to get your NEXT SIZE as accurate as you possibly can. I say this because if you subsequently have to revise your NEXT SIZE by way of ALTER TABLE, the exercise may well prove to be time consuming depending on which algorithm the ALTER TABLE deems relevant to achieve the end goal.

    I also suspect that your customer will want the work completed yesterday with zero downtime !

    There have been various issues with in-place ALTERs over time, but I recently had the pleasure of being greeted by this one...
    IT27997: BFCHECK BAD PAGE PG_FRPTR <SIZE> < SIZEOF(IFX_PAGE_T) 24 OR > SLOTBEG <SIZE> WHEN UPDATING PAGE WITH OUTSTANDING IN PLACE ALTERS
    ...data corruption is the last thing you need late on a Friday, right? Based on the number of rows you anticipate, you may well need to spend time closing down open in-place ALTERs by performing a dummy UPDATE against each row.

    Have fun!





    ------------------------------
    Kirit Rana
    ------------------------------



  • 20.  RE: Table Extent Size calculation

    IBM Champion
    Posted Fri November 20, 2020 12:41 PM
    Hi Kriti,

    please let me set this record straight:  altering next size is completely separate from altering the column list (add,drop,modify on columns) and would neither go into slow nor (fast) in-place alter, it would simply only modify a little entry in the system catalog and maybe on partition header pages, but had no business in dealing with any of the table's data.

    HTH,
     Andreas

    ------------------------------
    Andreas Legner
    ------------------------------



  • 21.  RE: Table Extent Size calculation

    Posted Fri November 20, 2020 01:29 PM
    Edited by System Fri January 20, 2023 04:45 PM
    I stand corrected Andreas. Schoolboy error; Maybe some more homework for me, time permitting. 

    It would only apply to extents going forwards, unless you unload the data first, ALTER TABLE, then re-load.

    Thank you for pointing it out.

    ------------------------------
    Kirit Rana
    ------------------------------