Db2

Expand all | Collapse all

Rows read versus accessed - Table performance

  • 1.  Rows read versus accessed - Table performance

    Posted Wed July 08, 2020 08:15 PM
    Hi team

    looking at table performance view,  I am trying to understand what  'Rows read versus accessed/min'  means. 

    I believe this is trying to give the famous (Rows read / Rows selected)  ratio  which we normally see as an factor , instead of a %. 
    And it seems there is some miss calculation for this column.  I only get 100% or 0%,   nothing in between.. 


    If this metric is trying to the % of rows got selected  from the total of rows read..  something is wrong as I am pretty sure the 0% is not true for these tables. 

    Regards



    ------------------------------
    Samuel Pizarro
    ------------------------------

    #Db2


  • 2.  RE: Rows read versus accessed - Table performance

    Posted Thu July 09, 2020 09:59 PM
    Hi Samuel,

    Thanks for bringing up this issue! We'll investigate and let you know the update. thanks.

    ------------------------------
    KAI DING
    ------------------------------



  • 3.  RE: Rows read versus accessed - Table performance

    Posted Tue August 04, 2020 03:43 AM
    Hi Samuel,

    Here is the git issue to track this defect: https://github.ibm.com/tools-for-aps/zh/issues/19765

    Please verify it in the next DMC release.

    And thank you again for pointing out this defect!

    ------------------------------
    KAI DING
    ------------------------------



  • 4.  RE: Rows read versus accessed - Table performance

    Posted Tue August 04, 2020 09:45 AM

    Hi @KAI DING 

    Thanks for the update...

    I still wonder what this metric will show,  for those extreme cases  where rows_read is extremely higher than rows_selected. 

    like 500.000.000  vs 100 .   What this metric would show ?  ​  

    not sure if % is an appropriate format for this metric.   It has been a factor for a long time.. 



    ------------------------------
    Samuel Pizarro
    ------------------------------



  • 5.  RE: Rows read versus accessed - Table performance

    Posted Sun August 09, 2020 10:14 PM
    Hi Samuel,

    Currently console uses the percentage value (rows_read / rows_selected) as the percentage for this metric. Would you please let us know your thoughts about what would be more appropriate for it? thanks.

    ------------------------------
    KAI DING
    ------------------------------



  • 6.  RE: Rows read versus accessed - Table performance

    Posted Wed August 19, 2020 10:40 AM
    Hi @KAI DING ,  sorry for the long time to answer...   (got stuck in some other urgent issues here) . ​


    | "Currently console uses the percentage value (rows_read / rows_selected) as the percentage for this metric."

    Hum, I guess you mean the opposite right..  as rows_read is always higher than rows_selected.  If you use %,  you should always divide by the higher (total) value.   It does not makes any sense dividing rows_read by rows_selected and present that in a % format.  What that % means ? nothing! 

    Let's put some values as example here...  Look at this picture,  from DSM,   WLM workload summary


    The "ROWS READ / ROWS RETURNED" is pretty clear!  It tells me that I am reading too much rows in order to return only few of them..  

    In other words,  the agent needs to read 70k rows in order to return a single one!   Great!,  interpretation is directly and precise! 

    What,  would be the % value in this case on DMC . 

    If you are really doing  (rows_read / rows_selected)  and present that in % terms,  this is wrong!  this is not percentage..  (statistically speaking) , so the % sign should be removed from it..  and the metric should be presented as a regular number (it's actually a factor ,  how big one value is when compared with the other). 

    But, if DMC is doing  " 100 * (rows_selected / rows_read) %" , using this 1st rows as example..  this would give us : 

     100 * ( 3,697,974 / 48.35 ) = 0.0013% 

    You would need to format the number to have 4 or even 5 decimal precision to have a significant value to show...  (Most of the cases  2 decimals precision wouldn't be enough and they would all be shown as 0.00% ) 

    Not only that..  but reading  76k ,  is much easier to interpret than  0.0013%.   

    The 1st one the interpretation is immediate..  I am reading 70k more rows that I should !  

    when I read 0.0013% ,  I have to make additional math to get how BAD I am doing the rows selection..  


    Regards 



    ------------------------------
    Samuel Pizarro
    ------------------------------



  • 7.  RE: Rows read versus accessed - Table performance

    Posted Fri August 21, 2020 05:21 AM
    Hi Samuel,

    Thanks for you detailed explanation!

    So are you suggesting to use ​"ROWS READ / ROWS RETURNED" instead of "100 * (rows_selected / rows_read) %" as the factor, which indicates how many rows are read for a single selected row?

    I think the latter calculation is something like "hit ratio". The smaller the value is, the poorer the performance is, though the "0.0013%" value is not very clear compared to "70K". Yuan, what's your suggestion?​​

    ------------------------------
    KAI DING
    ------------------------------



  • 8.  RE: Rows read versus accessed - Table performance

    Posted Fri August 21, 2020 01:23 PM
    Edited by System Fri January 20, 2023 04:45 PM

    Hi @KAI DING

    ​| I think the latter calculation is something like "hit ratio". The smaller the value is, the poorer the performance is, 


    I believe the intention was exactly that..   But hit-ratio for BPs (logical vs total_reads)  makes sense..  as , normally the 2 metrics are close to each other. 

    But here,  they won't be !  2 rows_read to 1 rows _selected,  already gives you a 50% hit-ratio.   Most of the times this metric will be too low..  needing several decimal places (4 , or even 5)  to be able to show up something.. 



    ------------------------------
    Samuel Pizarro
    ------------------------------



  • 9.  RE: Rows read versus accessed - Table performance

    Posted Sun August 23, 2020 09:17 PM
    Hi Samuel,

    I created a git issue to track this issue: https://github.ibm.com/tools-for-aps/zh/issues/20040

    Yuan has the same idea as you. I'm also OK to change the comparison to "ROWS READ / ROWS RETURNED" if there is no opposite justification from team.

    ------------------------------
    KAI DING
    ------------------------------



  • 10.  RE: Rows read versus accessed - Table performance

    Posted Mon August 24, 2020 09:04 AM
    Thanks  !

    ------------------------------
    Samuel Pizarro
    ------------------------------



  • 11.  RE: Rows read versus accessed - Table performance

    Posted Sat October 10, 2020 04:12 AM
    Hi Samuel,

    Today I checked code logic and Db2 KC, and I found that we made a mistake (or a typo).

    "rows_read" is the number of rows read from the table. https://www.ibm.com/support/knowledgecenter/SSEPGG_11.5.0/com.ibm.db2.luw.admin.mon.doc/doc/monitor-elements-r.html#r0001317

    "rows_accessed" is actually a calculated field of console, including all accessed rows (read, insert, update, delete).

    There is no "rows_returned" field in "Table Performance" (related table function is MON_GET_TABLE).

    The purpose is to compare how many rows are read VS. how many rows are accessed (read, insert, update, delete).

    So, it's not a "rows_read / rows_returned" issue. If we want to keep the strategy "big number / small number", the result should be "rows_accessed_versus_read".

    BTW: I think I was misled by your picture including "rows_read" and "rows_returned". The picture is actually not TablePerformance. :)

    ------------------------------
    KAI DING
    ------------------------------



  • 12.  RE: Rows read versus accessed - Table performance

    Posted Mon October 12, 2020 08:33 AM

    Ok.  if that is the case,  that's fine!    

    All confusion started because the proximity in terminology used,  and lack of a proper documentation of what each metric really means. ... 

    Nobody (regular users)  would never know that "rows_accessed" would actually mean the sum of other 4 base metrics. 

    as DBA performance analysts,  we need to have such detailed documentation KAI.  otherwise we will make wrong assumptions and wrong analysis.  

    DSM/DMC needs urgently to document ALL their metrics being used,  what they really mean,  how they correlate with real db2 metrics,  and how they are calculated.  I am impressed this is not available yet.  

    So,  for that metric,  the % format would be fine...  

    But,  rows_read / rows_returned  is really a miss.  This metric should also exist.  

    Regards



    ------------------------------
    Samuel Pizarro
    ------------------------------