IBM i Global

IBM i 

A space for professionals working with IBM’s integrated OS for Power systems to exchange ideas, ask questions, and share expertise on topics like RPG and COBOL development, application modernization, open source integration, system administration, and business continuity.


#Power


#IBMi
#Power
 View Only
  • 1.  DB2 allocate

    Posted 2 days ago

    Hi,

    While running against storage constraints I was reviewing the concept of large database fields and the allocate statement once more.

    I was under the impression that by default DB2 SQL would always store fields inline unless I specified the allocate keyword, however when double checking this I noticed the following "If the allocated length specification is omitted, an allocated length of 0 is assumed." meaning that in such case the data is always stored in the overflow area (and not inline).

    So if I define a column in DB2 via SQL as for example LastName VarChar(100)... this would not be stored inline but in an overflow area and thus have a performance impact ?  I find this a bit strange, I always thought it was the opposite ?

    Source: CREATE TABLE - IBM Documentation



    ------------------------------
    Paul Nicolay
    ------------------------------


  • 2.  RE: DB2 allocate

    Posted 2 days ago
    That's correct - data longer than the ALLOCATEd length ( in your case zero ) will end up in the overflow area and require another i/o.

    Best regards

         Stefan

     








  • 3.  RE: DB2 allocate

    Posted 2 days ago
    Edited by Paul Nicolay 2 days ago

    In the mean while I found the overflow information in qsys2.syscolumnstat and it indeed confirms your answer.

    It seems however this is something that many people forget, even I did as re-reading my own documentation stated that tables should be defined with ALLOCATE 🤭

    Question remains a bit if it is usefull (to specify a smaller allocate length) for typical fields like firstname, lastname, street, ... that are in the range of 1 to 50 bytes.

    Also does it make sense to allocate the maximum for a field as when having one column that goes in the overflow area, you've the performance hit anyway (expect when you explicitely specify the column on the select list which is another discussion).  

    The answer to all this is probably... it depends.



    ------------------------------
    Paul Nicolay
    ------------------------------



  • 4.  RE: DB2 allocate

    Posted 2 days ago
    Edited by Satid S 2 days ago

    Dear Paul

    >>>>> So if I define a column in DB2 via SQL as for example LastName VarChar(100)... this would not be stored inline but in an overflow area and thus have a performance impact ?  I find this a bit strange, I always thought it was the opposite ? <<<<

    Perhaps. you feel it is strange only because you are not aware of its design. Not long ago, someone asked similar line of question as yours (wasn't it  you?) and Db2i expert Kent Milligan provided the following explanation :

    [QUOTE]

    VARCHAR data is not split between the two, it's designed such that it's stored in EITHER the fixed OR overflow area.

    ........

    That's why recommendation is for an allocate value where 90-95% of the column values can be stored in the fixed area.

    [UNQUOTE]

    So, it appears that it is sensible to always specify ALLOCATE and with a sensible length.  

    By the way, regarding your concern on disk space, another important point is to try to reduce the total number of deleted records in tables and one efficient way to achieve this is to set table attribute Reuse Deleted Records to *YES.  This attribute is set to *YES automatically for a tables created by SQL CREATE TABLE.  Not so for tables created by DDS and you need to use CHGPF to change the attribute.  

    You can identify tables with a lot of unpurged deleted records by querying DB2 catalog view SYSTABLESTAT:

    SELECT table_name, table_schema, data_size, number_rows, number_deleted_rows from QSYS2.SYSTABLESTAT ORDER BY number_deleted_rows FETCH FIRST 50 ROWS ONLY.

     

     



    ------------------------------
    Satid S
    ------------------------------



  • 5.  RE: DB2 allocate

    Posted 2 days ago
    Edited by Paul Nicolay 2 days ago

    Unless my memory fails on me, it wasn't me asking that question before.

    The quote "VARCHAR data is not split between the two, it's designed such that it's stored in EITHER the fixed OR overflow area." is however something that is not clear to me.  

    VARCHAR(50) without allocate... that's clear, it always goes to the overflow

    VARCHAR(50) with ALLOCATE(30)... below 30 characters it is stored inline, but is the quote claiming that beyond 30 characters they ALL (even the initial 30 bytes) go to the overflow area ?

    VARCHAR(50) with ALLOCATE(50)... that's clear as well, it's all stored inline.

    Another question is what if I have two VARCHAR fiels, one without allocate and one with the max length... I presume they follow these rules individually (or is the quote implying something different ?).

    And finally, having tens of columns all with VarChar (or even CLOB's) do I only have a single overflow area (or can there be multiple ones... having multiple additional IO's).



    ------------------------------
    Paul Nicolay
    ------------------------------



  • 6.  RE: DB2 allocate

    Posted 2 days ago

    I've embedded my responses below###

    1) VARCHAR(50) without allocate... that's clear, it always goes to the overflow

    ###Mostly true, if the column is 30 bytes or less Db2 may change the allocate value to a non-zero value.

    2) VARCHAR(50) with ALLOCATE(30)... below 30 characters it is stored inline, but is the quote claiming that beyond 30 characters they ALL (even the initial 30 bytes) go to the overflow area ?

    ### If a 32 byte string is stored  in the column, then all 32 bytes will be stored in the Overflow area.  The 30 allocated inline bytes will be allocated, but not used. 

    VARCHAR(50) with ALLOCATE(50)... that's clear as well, it's all stored inline.

    3) Another question is what if I have two VARCHAR fields, one without allocate and one with the max length... I presume they follow these rules individually (or is the quote implying something different ?).

    ### The Allocate rules apply to each individual column.

    4) And finally, having tens of columns all with VarChar (or even CLOB's) do I only have a single overflow area (or can there be multiple ones... having multiple additional IO's).

    ### There's a single Overlflow area.  Also, when one Variable length column is referenced in a query - all of the columns referenced in the overflow  area get paged into memory.  This is the same behavior as a table with all fixed-length columns - if one column is referenced in a query, all the columns in the record definition are paged into memory.  However, the difference is fixed length column definitions won't include a very long  BLOB or CLOB value.



    ------------------------------
    Kent Milligan
    ------------------------------



  • 7.  RE: DB2 allocate

    Posted yesterday

    Thanks for the clarification, however I still have some questions left.

    • What is the idea behind defaulting the size to zero if not specified on allocate ?  In this case I'm not talking about larg(er) fields for which I would understand the default, but for smaller fields like FirstName, LastName, Address, ItemCode, ... or would you recommend to keep on using Char for such fields ?   Should the default be the same as the field length then there would be no usage of the overflow area by default and no double IO hit.  For me a default should be fine for the majority which doesn't feel like that now.
    • Why do I see in the statistics some fields twice, once without (like the majority) and once with a translation table *SYS ?  The field in question does have a CCSID which is identical to QCCSID so nothing special.

    Thanks,
    Paul



    ------------------------------
    Paul Nicolay
    ------------------------------



  • 8.  RE: DB2 allocate

    Posted 14 hours ago

    The default behavior  for variable length columns was to reduce disk storage that's why allocate defaults to zero (except  for this case-  Db2 may change the allocate value to a non-zero value).  You have to remember that storage was much more expensive when the VARCHAR types were introduced in V2R1.1.

    A statistic associated with a translation table means there are queries being run by programs/interfaces using an alternative sort sequence. Some sort sequences would change the value of the statistics that Db2 collects.



    ------------------------------
    Kent Milligan
    ------------------------------