Informix

nested-group-icon.png

DB2

Expand all | Collapse all

Is there a size of row function for rows with varchars?

  • 1.  Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 07:39 PM
    Hi Folks.

    I have a new situation where there are tables with varchars and lvarchrs so that their maximum row size is in the area of 2300 or 2500 bytes and some tables pushing 5000.  But all the DBspaces are 2K page spaces.  I'd like to average the actual row sizes of each table.  That might be facilitated by a function I could call form SQL, like:
    select thisrowsize() from <table> where keycolumn = <value>
    or
    select average_rowsize from <table>

    Owing to varchars, this size will vary from row to row the rowsize in the catalogs or sysmaster (like systabinfo.ti_rowsize) is of no help here.  But I'm trying to determine if they would get any space saving using a larger page size.

    WOW! I got the whole question out in less than 1000 words!  An accomplishment for Mr Concise-not! ;-)

    Thanks for guidance here.

    ------------------------------
    Jacob Salomon
    ---
    Nobody goes there anymore, it's too crowded.  --Attr: Yogi Berra
    ------------------------------


  • 2.  RE: Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 08:06 PM
    Jacob:

    Here you go:

    select st.tabname, sum(pg_pagesize - pg_frcnt)/avg(pt.nrows) as avg_rowsize
    from syspaghdr sp, sysptnhdr pt, systabnames st
    where sp.pg_partnum = pt.partnum
        and st.partnum = pt.partnum
        and st.dbsname = 'my_database'
        and st.tabname = 'my_table'
        and mod( sp.pg_flags, 2 ) = 1
    group by 1
    ;



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 3.  RE: Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 08:17 PM
    OK, I made a shell script out of it, and refined it a bit:

    ================================================
    #!/usr/bin/bash
    if [[ $# -ne 2 ]]; then
    echo "$0 databases tables"
    echo " "
    echo " databases and tables can be MATCHES wildcards!"
    echo " "
    exit 1
    fi

    dbs=$1
    tab=$2

    if [[ $dbs = "all" ]]; then dbs='*'; fi
    if [[ $tab = "all" ]]; then tab='*'; fi

    dbaccess sysmaster - <<EOF 2>/dev/null
    select st.dbsname, st.tabname, ceil( sum(pg_pagesize - pg_frcnt)/avg(pt.nrows))::int as avg_rowsize
    from syspaghdr sp, sysptnhdr pt, systabnames st
    where sp.pg_partnum = pt.partnum
    and st.partnum = pt.partnum
    and st.dbsname matches "$dbs"
    and st.tabname matches "$tab"
    and mod( sp.pg_flags, 2 ) = 1
    group by 1, 2
    ;
    EOF


    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 4.  RE: Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 09:02 PM
    WOW, Art!  That was terrific!

    I found that one table whose max row size shows up as 4365 actually has an average row size of ~488 bytes.  Now I need to scratch my head a trifle.  Or better, your head.  :-)

    In a DBspace with a 2K page: If I had a table with varchars at a max size of 1600 bytes per row and I insert a 700-byte row, that leaves only 1316 bytes left in the page.  By the rules of varchars, the engine will NOT another row in that page because is has fewer than 1600 bytes to spare.  This wastes those 1316 bytes, of course.

    Question: Does that same reasoning apply when the max rows size is >2016 bytes and I have inserted that 700 byte row?  Because NO filled-out row would fit on an empty page, does the engine make an intelligent decision and insert another 600 byte row in that page?  I suspect not; it would still apply its "another whole row wouldn't fit" algorithm.

    I've generalized your query a bit:
    select st.dbsname, st.tabname, sum(pg_pagesize - pg_frcnt)/avg(pt.nrows) as avg_rowsize
    from syspaghdr sp, sysptnhdr pt, systabnames st
    where sp.pg_partnum = pt.partnum
        and st.partnum = pt.partnum
        and st.dbsname = '(my database)'
      --and st.tabname = '(my table)'    -- Commeted out to get all tables
        and mod( sp.pg_flags, 2 ) = 1
    group by 1, 2
    ;
    
    It's running a loooong time but I think I'll get some info I need.  But in that extreme case I described, I really should start gently nagging them to move the tables to 16K page DBspaces.

    Remind me, please: What is that lowest order bit of syspaghdr.pg_flags?

    Again, thanks MUCH!

    ------------------------------
    Jacob Salomon
    ---
    Nobody goes there anymore, it's too crowded.  --Attr: Yogi Berra
    ------------------------------



  • 5.  RE: Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 10:49 PM
    Jacob:

    Ok, yes, unless MAX_FILL_DATA_PAGES is set no additional rows can be placed on a page that already has data on it unless the max sized rows would fit. But with MAX_FILL_DATA_PAGES set new rows can go on a page as long as 10% of the page will be free after placing the new row.

    The low order bit set in the page header flag indicates a data page (versus bitmap, index, blob, etc.).

    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 6.  RE: Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 10:56 PM

    Art & company,
    my more general version of the query was still running over an hour later so I cancelled it.

    I had asked:
    > Remind me, please: What is that lowest order bit of syspaghdr.pg_flags?
    Answer is in sysmaster:flag_text: It is a data page.
    select * from flags_text where tabname = "syspaghfdr"
    tabname syspaghdr
    flags 1
    txt Data Page

    tabname syspaghdr
    flags 2
    txt Partition Descriptor Page

    tabname syspaghdr
    flags 4
    txt Partition Free List Page

    etc..
    Now I am uncertain if a remainder page is also a data page.  I can run another query to check that out but it's late and I'm getting foggy.  Also, I prefer an explicit bitmap function to using the mod function which works only for the lowest order bit.  So here's the query as I've mangled it.  Art, you have created a monster! ;-)

    select st.dbsname,
           st.tabname,
           sum(pg_pagesize - pg_frcnt)/avg(pt.nrows) as avg_rowsize
      from syspaghdr sp, sysptnhdr pt, systabnames st
     where sp.pg_partnum = pt.partnum
       and st.partnum    = pt.partnum
       and st.dbsname    = '(database)'
       and st.tabname    = '(table)'
       and bitval(sp.pg_flags, 1) = 1  -- I care only about data pages
       and sp.pg_nslots >= 1           -- with at least 1 row
       and bitval(sp.pg_flags, 512) = 0 -- but not remainder pages
     group by 1, 2
    ;


    It would appear that this able that motivated my original question has no remainder pages.  Interesting waste of space, ain't it?



    Another improvement over your original query:  I suspect your query counted empty pages into the avg(pt.nrows). I added the clause:
    and sp.pg_nslots >= 1 -- with at least 1 row




    ------------------------------
    Jacob Salomon
    ---
    Nobody goes there anymore, it's too crowded.  --Attr: Yogi Berra
    ------------------------------



  • 7.  RE: Is there a size of row function for rows with varchars?

    Posted Mon November 30, 2020 11:09 PM
    A remainder page has 9 set in the flag (so 8+1). so yes, the '1' bit is set on remainder pages as well. Good catch to eliminate empty pages! 

    FWIW, the page flags are documented in my printfreeB.ec utility along with other interesting 'bits' of juice (pun intended)!

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 8.  RE: Is there a size of row function for rows with varchars?

    Posted Tue December 01, 2020 10:36 AM
    Hi Art.

    Although you answered my question, the horse ain't dead yet and the monster you created is back to haunt you this fine morning.  :-)

    I added another column here, out of curiosity.  You know, that trait that was implicated in the demise of the feline...

    avg(sp.pg_frcnt) as avg_free_bytes

    Somewhat paradoxical result, though that does not yet prove it wrong.
    dbsname        (my database)
    tabname        (my table)
    avg_rowsize    487.633630716036
    avg_free_bytes 960.388235294118

    These quantities to not add to to the expected 2020.  So I ran this query:
    select st.dbsname[1,10],
           st.tabname[1,20],
           sp.pg_pagenum,
           sp.pg_nslots,        -- Number of rows in the page
           sp.pg_frcnt          -- Free bytes in the page
      from syspaghdr sp, sysptnhdr pt, systabnames st
     where sp.pg_partnum = pt.partnum
       and st.partnum    = pt.partnum
       and st.dbsname    = '(my database)'
       and st.tabname    = '(my table)'
       and bitval(sp.pg_flags, 1) = 1  -- I care only about data pages
       and sp.pg_nslots >= 1           -- with at least 1 row
       and bitval(sp.pg_flags, 512) = 0 -- but not remainder pages
    ;​
    
    dbsname          tabname           pg_pagenum pg_nslots pg_frcnt
    
    (my database)    (my table)         1         11         991
    (my database)    (my table)         2         10         934
    (my database)    (my table)         3         11         962
    (my database)    (my table)         4         11        1010
    (my database)    (my table)         5         11         948
    (my database)    (my table)         6          9        1002
    (my database)    (my table)         7         10         945
    (my database)    (my table)         8         10         954
    (my database)    (my table)         9         11         923
    (my database)    (my table)        10         11         892
    (my database)    (my table)        11         10         985
    (my database)    (my table)        12         11         922
    (my database)    (my table)        13         11         986
    (my database)    (my table)        14         11         956
    (my database)    (my table)        15         10        1011
    (my database)    (my table)        16         11         953
    (my database)    (my table)        17         11         924
    (my database)    (my table)        18         10         983
    (my database)    (my table)        19         10         981
    (my database)    (my table)        20         11         936
    (my database)    (my table)        21         10        1012
    (my database)    (my table)        22         11         924
    (my database)    (my table)        23         12         894
    (my database)    (my table)        24         11         987
    (my database)    (my table)        25         10         982
    (my database)    (my table)        26         10         917
    (my database)    (my table)        27          9        1007

    (I cut it off in after the first few lines.)  So the average free count I got from my added clause, ~960, is indeed consistent  with the above per-page results.  But I see multiple slots used in each page.  I did check on the MAX_FILL_DATA_PAGES parameter and found it to be 0.  (I also checked out the environment; no such variable is set.)

    Puzzling...  
    So I added avg(sp.pg_nslots) as avg_slots into your query.  A bit over 11 slots per page. Not nearly as much space wasted per page as I originally thought.

    It seems unlikely from a human standpoint that MAX_FILL_DATA_PAGES was originally set when they loaded the table but then switched to a new onconfig and forgot to turn that on.  We are using release 12.10.FC13(another 2-letter suffix).

    Definitely gotta talk to someone about that!




    ------------------------------
    Jacob Salomon
    ---
    Nobody goes there anymore, it's too crowded.  --Attr: Yogi Berra
    ------------------------------



  • 9.  RE: Is there a size of row function for rows with varchars?

    Posted Wed December 02, 2020 11:17 AM
    Edited by Jacob Salomon Wed December 02, 2020 11:51 AM
    Obsessive-compulsive disorder is a positive trait for programmers, DBAs, any anyone who codes.

    Art, when I first ran your query, it showed an average row size of about 488 bytes for the table.  However, when I ran my page-by-page query, it showed, for example, (the first item) a page with 11 slots and 991 free bytes.  This means 11 rows in 1029 bytes, which averages 93.5 bytes per row.  This is not consistent with the average of 488 bytes from your calculation.  I think I've fixed it, however:
    select st.dbsname[1,10],
           st.tabname[1,20],
           round(avg(sp.pg_nslots), 2) as avg_slots,
         --sum(sp.pg_pagesize - sp.pg_frcnt)/avg(pt.nrows) as avg_rowsize,
           round(avg((sp.pg_pagesize - sp.pg_frcnt)/sp.pg_nslots), 2) as avg_rowsize,
           round(avg(sp.pg_frcnt),2) as avg_free_bytes
      from syspaghdr sp, sysptnhdr pt, systabnames st
     where sp.pg_partnum = pt.partnum
       and st.partnum    = pt.partnum
       and st.dbsname    = '(my database)'
       and st.tabname    = '(my table)'
       and bitval(sp.pg_flags, 1) = 1  -- I care only about data pages
       and sp.pg_nslots >= 1           -- with at least 1 row
       and bitval(sp.pg_flags, 512) = 0 -- but not remainder pages
     group by 1, 2
    ;
    Result:
    dbsname         (my database)
    tabname         (my table)
    avg_slots       11.14
    avg_rowsize     95.51    <--- This is more consistent with the page-by-page query
    avg_free_bytes  960.61

    It is still a mystery to me how more than row gets into a page when MAX_FILL_DATA_PAGES is 0 but that's for me to discuss with the main DBA here.  I have no real access to the database itself so it's not possible (yet) to determine which version is correct.  But when I modified my page-by-page query to check for average row length, I found only 3 rows longer than 125 bytes. (Actually, 121; 4 bytes for the slot-table entry).

    Related mystery, but from the other side of the coin, is why there is so much free space in every blessed page.  If MAX_FILL_DATA_PAGES is being ignored many pages should pretty full.  In fact there are exactly 2 pages (out of ~7500) with under 850 free bytes. As Yul Brynner sings: This is a puzzlement! :-)

    Still working on it..



    ------------------------------
    Jacob Salomon
    ---
    Nobody goes there anymore, it's too crowded.  --Attr: Yogi Berra
    ------------------------------



  • 10.  RE: Is there a size of row function for rows with varchars?

    Posted Thu December 03, 2020 01:29 PM
    Sorry about that mess of code & data.  I thought the {;} code designation would protect my SQL and neat columns ftmm overly intelligent (as in intelligent donkey, in more colloquial terms >;-)  text formatting.