IBM i Global

IBM i 

A space for professionals working with IBM’s integrated OS for Power systems to exchange ideas, ask questions, and share expertise on topics like RPG and COBOL development, application modernization, open source integration, system administration, and business continuity.


#Power


#IBMi
#Power
 View Only
  • 1.  What was the temporary space fiend?

    Posted 2 days ago

    Got a message that our system was approaching threshold.  By the time I got back in this 11.4TB LPAR was back down to 58% used.

    Something ate about 5TB of temp space.

    Current temporary used :   528570 M
    Peak temporary used  . :     5180 G

    I tried the RSS sample for SYSTOOLS.ENDED_JOB_INFO() but the SQL, unaltered, was just showing some small potatoes.  Perhaps the problem was a system task and not so much a 'job'?

    How do I get the details of what was eating this temp space?

    I tried the show sql on the chart.  It's obviously not showing you all the sql.  But enough to get on a trail.  It's obviously creating some CREATE ALIAS type stuff to override to particular members.

    SELECT MAX(INTNUM) AS INTNUM,
           DTECEN || DTETIM AS CSDTETIM,
           MAX(ALL_TEMP_STORAGE_K) AS MAX_ALL_TEMP_STORAGE_K,
           MAX(SYAJOBTMP) AS SYAJOBTMP,
           MAX(SYUSERTMP) AS SYUSERTMP,
           MAX(SYEJOBTMP) AS SYEJOBTMP,
           MAX(SYDBTMP) AS SYDBTMP,
           MAX(SYOSTMP) AS SYOSTMP,
           SUM(JBHSQLSTMT) AS JBHSQLSTMT,
           DTETIM AS DTETIM,
           DTECEN AS DTECEN
        FROM (
                SELECT QSY.INTNUM AS INTNUM,
                       QSY.DTETIM AS DTETIM,
                       QSY.DTECEN AS DTECEN,
                       (SYOSTMP + SYDBTMP + SYAJOBTMP + SYEJOBTMP + SYUSERTMP) *
                           0.0039 AS ALL_TEMP_STORAGE_K,
                       SYAJOBTMP * 0.0039 AS SYAJOBTMP,
                       SYUSERTMP * 0.0039 AS SYUSERTMP,
                       SYEJOBTMP * 0.0039 AS SYEJOBTMP,
                       SYDBTMP * 0.0039 AS SYDBTMP,
                       SYOSTMP * 0.0039 AS SYOSTMP,
                       QJB.JBHSQLSTMT AS JBHSQLSTMT
                    FROM QTEMP.QPFRDATAQAPMJOBMIQ341000002 QJB
                         INNER JOIN QTEMP.QPFRDATAQAPMSYSTEMQ341000002 QSY
                             ON (QJB.INTNUM = QSY.INTNUM)
            ) A
        GROUP BY DTETIM,
                 DTECEN
        ORDER BY CSDTETIM
    ;
     
    I modified it a little but I can't find the particular system task.  I think I'm barking up the wrong tree.  It seems to flag multiple jobs at the same time but I'm guessing that the logic is taking a total temp space used at the time and not the job/task temp used.
     
     
     
    create alias QTEMP.QPFRDATAQAPMJOBMIQ341000002 for qpfrdata.qapmjobmi (q341000002);
    create alias qtemp.QPFRDATAQAPMSYSTEMQ341000002 for QPFRDATA.QAPMSYSTEM (Q341000002);
     SELECT QSY.INTNUM AS INTNUM,
                       QSY.DTETIM AS DTETIM,
                       QSY.DTECEN AS DTECEN,
                       (SYOSTMP + SYDBTMP + SYAJOBTMP + SYEJOBTMP + SYUSERTMP) *
                           0.0039 AS ALL_TEMP_STORAGE_K,
                       SYAJOBTMP * 0.0039 AS SYAJOBTMP,
                       SYUSERTMP * 0.0039 AS SYUSERTMP,
                       SYEJOBTMP * 0.0039 AS SYEJOBTMP,
                       SYDBTMP * 0.0039 AS SYDBTMP,
                       SYOSTMP * 0.0039 AS SYOSTMP,
                       QJB.JBHSQLSTMT AS JBHSQLSTMT,
                       qjb.*, qsy.*
                    FROM QTEMP.QPFRDATAQAPMJOBMIQ341000002 QJB
                         INNER JOIN QTEMP.QPFRDATAQAPMSYSTEMQ341000002 QSY
                             ON (QJB.INTNUM = QSY.INTNUM)
                    order by 4 desc
    ;


    ------------------------------
    Robert Berendt IBMChampion
    Business Systems Analyst, Lead
    Dekko
    Fort Wayne
    ------------------------------


  • 2.  RE: What was the temporary space fiend?

    Posted 2 days ago
    Edited by Satid S 2 days ago

    Dear Robert

    Your post reminds me of a past case (POWER5 or 6 era) in which a few of my customers' run-away queries filled up all the entire free disk space and brought the entire system down. Those were the days when TB size disk was not yet available.  

    Based on the info from your chart that total SQL statements peaked at 19:15 period and the peak temp disk consumption peaked at 19:30 period, please use ACS: Database --> SQL Performance Center GUI to create a Plan Cache Snapshot image for statements that ran after 19:00 and took longer than, say, 10-second runtime and then display the snapshot's content and sort the entries by date-time.  Then look at Visual Explain graph of each statement with long runtime that ran from 19:00 onwards and look for an icon that indicates temp table/data creation in each statement's access plan.   

    If you think this may likely happen again in the future, you can make use of DB2i Query Supervisor (  https://www.ibm.com/support/pages/query-supervisor ) by setting a threshold for a Temp Disk Space size and write an Exit Program to leave a bread crump for you on this event. 

    Query Supervisor Gives Database Engineers New Power: https://www.itjungle.com/2021/04/14/query-supervisor-gives-database-engineers-new-power/     

    Query Supervisor example exit programs:  https://www.ibm.com/docs/en/i/7.4.0?topic=ssw_ibm_i_74/rzajq/qrysuperexitpgmexamples.htm   

    The old Query Governer is also available but it is not as versatile as the newer Query Supervisor. 

    Wishing you a good luck in catching the temp disk hogger. 



    ------------------------------
    Satid S
    ------------------------------



  • 3.  RE: What was the temporary space fiend?

    Posted 2 days ago

    What the plan cache snapshot says was the biggest temp space user wasn't until the next morning.  IDK what UOM temporary storage is but it doesn't seem to reflect experience.

    However that long processing time fits right in the time of death.



    ------------------------------
    Robert Berendt IBMChampion
    Business Systems Analyst, Lead
    Dekko
    Fort Wayne
    ------------------------------



  • 4.  RE: What was the temporary space fiend?

    Posted 22 hours ago
    Edited by TIMOTHY CLARK 22 hours ago

    I wanted to follow up on this thread, because we made some improvements to the way we display the plan cache snapshot data in IBM i 7.6.

    In releases prior to IBM i 7.6, as you captured in your post, the Temporary Storage column displays the estimated temporary storage usage for the query. The actual storage used is available, but you have to open up Visual Explain for that query. There you will see Temporary Storage Used (MB) as an attribute in the VE diagram.

    We found this as confusing as you did, so in IBM i 7.6, we changed the label of the column to be "Average Temporary Storage (MB)." it will display the average of the temporary storage used across all runs.

    It sounds like you found the problematic query, but like others here, I would recommend Query Supervisor for detecting/preventing these kinds of situations in the future.



    ------------------------------
    Tim Clark
    Db2 for IBM i / SQL Optimizer
    ------------------------------



  • 5.  RE: What was the temporary space fiend?

    Posted 2 days ago
    Edited by Aaron Brown 2 days ago

    Hi Robert,

    A lot of the temp storage tooling is based on the QSYS2.SYSTMPSTG view in the SQL services, but that requires that you catch it happening live.  This TechNote helps a bit with that, and potentially the MG tools might have some remaining summary data.  MG tools Temp Storage Summary data is where I was thinking you might look to see if there are any remnant crumbs around.

    https://www.ibm.com/support/pages/understanding-and-investigating-temporary-storage-growth-database-buckets-8-9-11-and-14

    Possibly of some help as well, an article from Dawn May about checking into performance data for temp storage.  This is an excerpt from a whole series, which is mostly about finding the temp storage while the offending process is consuming it - but part 4 talks about some data that might be in your performance data.  The article is a few years old, so the Navigator directions are likely a bit out of date - but worth a shot.

    https://dawnmayi.com/2015/01/06/ibm-i-7-2-improved-temporary-storage-tracking-part-4/

    Outside of that, I would agree with Satid - perhaps putting some temp storage exits in place might help with tracking this in the future.  I've used the older Query Engine exit point in the past to prevent queries from starting that were going to exceed a certain threshold based on the estimator.  (It was a much cruder time - the database changes that happened in V5R2.  We were desperate.  I think everyone remembers those days...)

    Tim Clark has an article series talking about setting limits on queries as well if it helps:  https://developer.ibm.com/articles/i-temporary-storage-and-db/ 

    and 

    https://developer.ibm.com/articles/i-temp-storage-3/



    ------------------------------
    Aaron Brown
    ------------------------------



  • 6.  RE: What was the temporary space fiend?

    Posted yesterday

    Happening again.  Tried QSYS2.SYSTMPSTG.

    Same time as someone is running either that same query or one dang close.

    COGNOS generated query from hell.



    ------------------------------
    Robert Berendt IBMChampion
    Business Systems Analyst, Lead
    Dekko
    Fort Wayne
    ------------------------------



  • 7.  RE: What was the temporary space fiend?

    Posted yesterday

    ... and off to the right was a column for bucket peak size.  That was over 5TB.



    ------------------------------
    Robert Berendt IBMChampion
    Business Systems Analyst, Lead
    Dekko
    Fort Wayne
    ------------------------------



  • 8.  RE: What was the temporary space fiend?

    Posted 10 hours ago
    Edited by Satid S 9 hours ago

    >>>> COGNOS generated query from hell. <<<<

    To prevent such disk hogging query, you can set a limit on Temp Storage for all query/SQL jobs by CRTDUPOBJ the file QAQQINI from QSYS library to QUSRSYS (this library location for QAQQINI will exercise control for all query jobs).  Then you add (or change the value of) the parameter STORAGE_LIMIT to some high value you see fit as a temp storage limit (100 or 200 GB ?).   After this, such bad query will not be allowed to start and Query Governor sends an inquiry message (CPA4259) to the user or job log, indicating the estimated usage and the defined limit. The query will not execute until a response is provided to this inquiry message, allowing for intervention. 



    ------------------------------
    Satid S
    ------------------------------