
Prepared statement for select ... into

  • 1.  Prepared statement for select ... into

    Posted Mon September 16, 2024 10:11 AM

Recently I worked on improving the performance of an application that repeatedly ran a set of 200 queries (all of the form 'insert into t1 select ... from ... where ...'). The queries were run using EXECUTE IMMEDIATE, and the application had to stop the query analyser from spending time optimising the queries, otherwise performance was horrible (QAQQINI: REOPTIMIZE_ACCESS_PLAN *ONLY_REQUIRED). I got a very significant improvement by creating a prepared statement for each of the queries (once) and running the prepared statements repeatedly. This makes sense, since it saves having to analyse the statement every time.

    prepare s1 from 'insert into t1 select ... from t2 where t2.c2 = ?';      // This is executed just once
    ...
    execute s1 using :var1;     // This runs many times & works a treat

    So I'm enthusiastic and wonder if I can use 'Prepare' elsewhere to improve performance - In particular for 'select... into' statements? These are used commonly in my shop and I have seen cases where they seem to be expensive.

    prepare s2 from 'select c1 into :? from t1 where t1.c2 = ?';      // This would be executed just once - but prepare with select into is not allowed
    ...
    execute s2 using :var1, :var2;     // This would be run many times

The answer seems to be 'no'. The closest I have found is to prepare a cursor and use FETCH INTO, but every time the selection criteria change I would have to close the cursor and open it again for the new criteria - much more code than simply running 'select ... into' again, and probably slower.

I want to know whether running 'select into' many times is slow in the same way that running 'insert into t1 select ...' seems to be slow - with the query engine having to evaluate the statement every time. If 'select into' is evaluated every time, is there any way to stop this, as we can for 'insert into ... select' using a prepared statement?
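The prepare-once, execute-many pattern described above is not specific to Db2 for i; most client APIs expose the same idea through parameterized statements. A minimal sketch using Python's sqlite3 module (the tables t1/t2 and all values are invented for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (c1 INTEGER);
    CREATE TABLE t2 (c1 INTEGER, c2 INTEGER);
""")
conn.executemany("INSERT INTO t2 VALUES (?, ?)",
                 [(i, i % 10) for i in range(100)])

# The SQL text stays constant, so the engine can parse/plan it once and
# reuse that plan for every execution - analogous to PREPARE s1 once and
# EXECUTE s1 USING :var1 many times.
ins = "INSERT INTO t1 SELECT c1 FROM t2 WHERE c2 = ?"
for var1 in range(10):
    conn.execute(ins, (var1,))

print(conn.execute("SELECT COUNT(*) FROM t1").fetchone()[0])   # prints 100
```

The key point is that only the bound value changes between executions, never the statement text.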

    Thanks for your help

    Colin Grierson



    ------------------------------
    Colin Grierson
    ------------------------------

    #SQL


  • 2.  RE: Prepared statement for select ... into

    Posted Mon September 16, 2024 11:03 AM

    Hi Colin,

1. Do you really need dynamic SQL for a SELECT ... INTO? Most SELECT statements that you might think have to be executed dynamically can in fact be executed statically. The only places where you cannot use host variables are schema and table/view names - if you have to access different tables/views and/or different schemas, you have to use dynamic SQL.
    2. SELECT ... INTO cannot be executed dynamically ... but VALUES ... INTO can.

StringSQL = 'VALUES(SELECT Col1, Col7 FROM YOURTABLE ... WHERE Fld1 = ? AND Fld2 = ? ...) INTO ?, ?';
    EXEC SQL PREPARE DynSQL FROM :StringSQL;
    EXEC SQL EXECUTE DynSQL USING :WhereFld1, :WhereFld2, :OutCol1, :OutCol2;

    You have to work with parameter markers (?). The first ? is replaced with the first host variable specified in the EXECUTE statement, the second with the second, and so on.
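Outside embedded SQL there is no INTO clause, so the same single-row lookup is simply a parameterized SELECT whose result columns are assigned to program variables. A hypothetical sqlite3 sketch of the positional binding (table, column names and values are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE yourtable (fld1 TEXT, fld2 TEXT, col1 INTEGER, col7 INTEGER)")
conn.execute("INSERT INTO yourtable VALUES ('a', 'b', 1, 7)")

# Parameter markers are positional: the first ? is bound to the first
# value supplied, the second ? to the second.
row = conn.execute(
    "SELECT col1, col7 FROM yourtable WHERE fld1 = ? AND fld2 = ?",
    ("a", "b"),
).fetchone()

# The two result columns play the role of the INTO host variables.
out_col1, out_col7 = row
print(out_col1, out_col7)   # prints 1 7
```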



    ------------------------------
    Birgitta Hauser
    Database and Software Engineer
    Birgitta Hauser - Modernization-Education-Consulting on IBM i (selfemployed)
    Kaufering
    +49 170 5269964
    ------------------------------



  • 3.  RE: Prepared statement for select ... into

    Posted Mon September 16, 2024 05:31 PM

    Thank you Birgitta! That looks like exactly what I was searching for. I'll do some tests and see if I can measure the performance of the two techniques.



    ------------------------------
    Colin Grierson
    ------------------------------



  • 4.  RE: Prepared statement for select ... into

    Posted Mon September 16, 2024 08:29 PM
    Edited by Satid S Mon September 16, 2024 09:35 PM

    Dear Colin

    When you mention "selection criteria changes", do you mean that a different combination of selected columns (and different ORDER BY, GROUP BY and join columns) was specified, or that all columns remain the same but the search values change? If the former, Db2 may need different indexes for good run-time performance of each of those different criteria. Did you use Visual Explain, invoked from a Plan Cache dump, to ensure all your different SELECTs used indexes rather than table scans (except any that select too many rows from the table)?

    Since Db2 SMP is now free of charge if you have an active SWMA for IBM i 7.x, you may also want to utilize it for SQL/query performance if your total CPU capacity is not yet highly utilized.



    ------------------------------
    Satid S
    ------------------------------



  • 5.  RE: Prepared statement for select ... into

    Posted Mon September 16, 2024 10:01 PM

    Thanks Satid

    I was looking for some way to 'prepare' a 'select ... into ... ' statement. Brigitta gave me exactly what I was looking for.



    ------------------------------
    Colin Grierson
    ------------------------------



  • 6.  RE: Prepared statement for select ... into

    Posted Tue September 17, 2024 01:55 AM

    Hi Birgitta

    I tested Prepare (VALUES ... INTO) / Execute against native I/O and SELECT ... INTO. Unsurprisingly, native I/O was fastest, but I was surprised that SELECT ... INTO came second and Prepare/Execute was a clear last. With the INSERT statements, using Prepare improved things a lot. (Yes, I did the PREPARE just once and the EXECUTE many times.) I also got date/time errors when using Prepare/Execute that did not happen with SELECT ... INTO.

    My test reads a given number of records from a file using native I/O in arrival sequence, and for each record fetches a row from table 2. The table being read has 8 columns, including a date and a time. The logical file used for the native I/O is a perfect index for the SQL. Here are the results:

    1. call x_test1 ('0' x'0100000f')    // This just reads the input file, fetching nothing                 
         Elapsed seconds: .30 CPU milliseconds used: 125
    2. call x_test1 ('1' x'0100000f')    // Native IO to a data structure           
         Elapsed seconds: 1.09 CPU milliseconds used: 391
    3. call x_test1 ('1A' x'0100000f')    // Native IO to fields         
         Elapsed seconds: 1.36 CPU milliseconds used: 549 
    4. call x_test1 ('2' x'0100000f')      // Select into a data structure        
         Elapsed seconds: 3.46 CPU milliseconds used: 1289
    5. call x_test1 ('2A' x'0100000f')      // Select into one variable per column       
         Elapsed seconds: 3.59 CPU milliseconds used: 1300
    6. call x_test1 ('3' x'0100000f')        // Execute prepared Values... Into, one variable per column omitting date and time fields       
         Elapsed seconds: 4.96 CPU milliseconds used: 1993
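For reference, the elapsed-versus-CPU split reported above can be reproduced with any harness that samples both clocks around a run. A generic Python sketch (the workload here is an invented stand-in, not the original test program):

```python
import time

def measure(label, fn, *args):
    # Sample both clocks around one run: wall-clock time (elapsed) and
    # process CPU time - the same two figures reported in the results above.
    t0, c0 = time.perf_counter(), time.process_time()
    fn(*args)
    elapsed = time.perf_counter() - t0
    cpu_ms = (time.process_time() - c0) * 1000
    print(f"{label}: Elapsed seconds: {elapsed:.2f} CPU milliseconds used: {cpu_ms:.0f}")
    return elapsed, cpu_ms

# Stand-in workload: sum a range 100,000 long.
elapsed, cpu_ms = measure("test", lambda: sum(range(100_000)))
```

Comparing the two figures is useful: a large gap between elapsed and CPU time usually points at I/O waits rather than query processing.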

    Not only was using the prepared VALUES ... INTO statement clearly the slowest, it also died with a date/time error if I included the date or time field. This was the case even when I changed the query to return hex(<date field>), so it seems to be within the SQL side of things, not related to mapping the value to the program field.

    I repeated these tests several times with consistent results. Native I/O is significantly faster than SELECT ... INTO, and reading into a data structure (even with only 8 columns) is noticeably faster than reading into separate variables. That said, they are all very fast - even the slowest method took only 5 seconds to retrieve 100,000 rows one by one!

    Thanks for your help

    Regards, Colin



    ------------------------------
    Colin Grierson
    ------------------------------



  • 7.  RE: Prepared statement for select ... into

    Posted Wed September 18, 2024 03:04 AM
    1. Native I/O, no optimization: the specified physical/logical file is used without any optimization.
    2. SELECT ... INTO into a data structure --> at least one FULL OPEN (full optimization - time consuming); a single pointer is returned. If you execute the same SELECT ... INTO multiple times (even with different host variables) in a loop, the subsequent executions should be (much) faster, since a PSEUDO OPEN can be performed.
    3. SELECT ... INTO into multiple variables: same as before, but multiple pointers have to be returned. BTW, you should avoid SELECT * and select only what you actually need. The optimizer may then be able to perform index-only access (i.e. it can use an index where all the information you need is in the key fields, so accessing the row itself is not necessary).
    4. VALUES ... INTO - dynamic SQL. While static SQL is already known at compile time, for dynamic SQL the syntax analysis and conversion of the string into an executable statement have to be done at run time, after which a FULL OPEN must be performed. That's why I asked whether you really need dynamic SQL. When using dynamic SQL, a FULL OPEN is always performed with the PREPARE statement. You could work with parameter markers, save the previous SQL statement string, and in a subsequent call compare the previously saved SELECT statement with the new one; if they are identical, you can skip the PREPARE and go straight to the EXECUTE. In this case a PSEUDO OPEN can be performed, which can be 10-20 times faster than a FULL OPEN.

    Also, you should not measure the (run) time of an SQL statement on its first run. The first run is always a FULL OPEN (and this is time consuming, since the ODP - Open Data Path - has to be created/opened). Subsequent executions (PSEUDO OPENs) are much faster, because the ODP is still there and only the data has to be updated.

    Reducing FULL OPENs to a minimum can result in much better performance.
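The "save the previous statement string and skip the PREPARE" technique can be sketched in any client API. A minimal Python sqlite3 illustration, where a counter stands in for the FULL OPEN (table, data and workload are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t2 (key1 INTEGER, key2 INTEGER, val INTEGER)")
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)",
                 [(i, i, i * 10) for i in range(5)])

last_sql = None
prepares = 0

def run(sql, params):
    # Skip the (notional) PREPARE when the statement text is unchanged,
    # mirroring the "compare the saved SQL string" technique.
    global last_sql, prepares
    if sql != last_sql:
        prepares += 1          # stands in for PREPARE / FULL OPEN
        last_sql = sql
    return conn.execute(sql, params).fetchone()   # stands in for EXECUTE / PSEUDO OPEN

sql = "SELECT val FROM t2 WHERE key1 = ? AND key2 = ?"
for k in range(5):
    run(sql, (k, k))

print(prepares)   # prints 1: one FULL OPEN, the remaining runs reuse it
```

Only the bound values change across the five executions, so the statement is "prepared" once.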



    ------------------------------
    Birgitta Hauser
    Database and Software Engineer
    Birgitta Hauser - Modernization-Education-Consulting on IBM i (selfemployed)
    Kaufering
    +49 170 5269964
    ------------------------------



  • 8.  RE: Prepared statement for select ... into

    Posted Sun September 22, 2024 07:21 PM

    Thanks Birgitta for explaining what is happening under the covers.

    In my case I want to optimize a job that processes many thousands of rows, all in the same way - using a fixed set of key fields and output columns. The total time is what I am interested in.

    // using select into
    do until end of test loop;
       read table1 to get keys;
       select ... into :var1, :var2, ... var8 from table2 where key1 = :keyvar1 and key2 = :keyvar2;
       process stuff...
    enddo;

    From what you say this will do one full open and many pseudo opens. All good.

    // using a prepared statement
    prepare stmt from 'values (select col1, col2, ... col8 from table2 where key1 = ? and key2 = ?) into ?, ?, ?, ?, ?, ?, ?, ?';
    do until end of test loop;
       read table1 to get keys;
       execute stmt using :keyvar1, :keyvar2, :var1, :var2, ... var8;
       process stuff...
    enddo;

    From what you say this should also do one full open and many pseudo opens. I expected this loop to be the faster of the two, but actually it is significantly slower - and it seems to be buggy.

    I had also expected native I/O to be faster, but not by so much - when reading into a data structure it took only a quarter of the time SQL took using SELECT ... INTO.

    Regarding selecting all columns into a data structure using '*': I can see the attraction when many columns are required, but doing this means the program will need to be recompiled whenever the table is changed - a major disadvantage.

    Regards, Colin



    ------------------------------
    Colin Grierson
    ------------------------------



  • 9.  RE: Prepared statement for select ... into

    Posted Mon September 23, 2024 12:42 AM

    NO! If you always run PREPARE followed by EXECUTE, you always get a FULL OPEN (with the PREPARE!).

    What you can do to get a single FULL OPEN followed by multiple PSEUDO OPENs is to save the string with the SQL statement. If the subsequent SQL statements are identical, you skip the PREPARE and run the EXECUTE immediately. In this situation you will get a single FULL OPEN and multiple PSEUDO OPENs.

    And if it is always the same statement, dynamic SQL (PREPARE and EXECUTE) makes no sense. A SELECT ... INTO or a static VALUES ... INTO will be better.



    ------------------------------
    Birgitta Hauser
    Database and Software Engineer
    Birgitta Hauser - Modernization-Education-Consulting on IBM i (selfemployed)
    Kaufering
    +49 170 5269964
    ------------------------------



  • 10.  RE: Prepared statement for select ... into

    Posted Wed September 25, 2024 07:44 PM

    Db2 for i does have the ability to detect the repeated PREPARE & EXECUTE of the same statement and avoid full opens. However, this detection doesn't work in all situations, so preparing a statement once and then repeating the EXECUTE is considered best practice.



    ------------------------------
    Kent Milligan
    ------------------------------



  • 11.  RE: Prepared statement for select ... into

    Posted Wed September 25, 2024 07:55 PM

    I do prepare once and execute many times, but it is still significantly slower than running select ... into. I presume the system can optimise the static SQL better than the prepared statement.

    Thanks for your help

    Colin



    ------------------------------
    Colin Grierson
    ------------------------------



  • 12.  RE: Prepared statement for select ... into

    Posted Thu September 26, 2024 12:12 AM

    Static SQL & Dynamic SQL have different default settings that can cause the query optimizer to choose a different implementation.  Sometimes that may result in Dynamic SQL performing faster, sometimes slower.



    ------------------------------
    Kent Milligan
    ------------------------------



  • 13.  RE: Prepared statement for select ... into

    Posted Thu September 26, 2024 03:48 AM
    Edited by Daniel Gross Thu September 26, 2024 03:48 AM

    Hi Colin,

    that schema is almost screaming "JOIN" - because what you are doing is basically a SQL JOIN by hand.

    So without knowing too much about the underlying problem, I would try to solve it this way:

    select table1.key1, table1.key2, table1.key3, ...
           sum(table2.val1), max(table2.val2), ...
    from table1
    join table2 on table2.key1 = table1.key1
               and table2.key2 = table1.key2
               and table2.key3 = table1.key3
               ...
    where ...
    group by table1.key1, table1.key2, table1.key3, ...
    order by table1.key1, table1.key2, table1.key3, ...

    This will solve many problems in one statement. Use a cursor to read the lines in just ONE loop.

    When approaching a problem with SQL, the main performance optimization you can make is to minimize the number of SQL statements processed. In general, it's best if the only SQL statement inside a loop is a FETCH.
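To see why collapsing the per-row lookups into one JOIN pays off, here is a minimal sqlite3 sketch (tables and values invented) showing the hand-rolled lookup loop and a single GROUP BY join producing identical results:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE table1 (key1 INTEGER);
    CREATE TABLE table2 (key1 INTEGER, val1 INTEGER);
""")
conn.executemany("INSERT INTO table1 VALUES (?)", [(k,) for k in range(3)])
conn.executemany("INSERT INTO table2 VALUES (?, ?)",
                 [(k, v) for k in range(3) for v in (1, 2)])

# Hand-rolled join: one lookup statement per driver row.
by_hand = {k: conn.execute("SELECT SUM(val1) FROM table2 WHERE key1 = ?",
                           (k,)).fetchone()[0]
           for (k,) in conn.execute("SELECT key1 FROM table1")}

# Single JOIN + GROUP BY: one statement, one fetch loop.
joined = dict(conn.execute("""
    SELECT table1.key1, SUM(table2.val1)
    FROM table1
    JOIN table2 ON table2.key1 = table1.key1
    GROUP BY table1.key1
"""))

print(by_hand == joined)   # prints True: same answer, far fewer statements
```

The join version gives the optimizer the whole problem at once, instead of one row-sized fragment at a time.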

    Work your way up using iACS "Run SQL Scripts" to build your cursor SELECT statement, and do as much as you can in that SELECT - up to the point where it gets nasty. That barrier will shift towards "more SQL, less RPG" the more you use this technique.

    And if you need to update data, try to update as many rows at once as possible - not one row at a time. If you have "dynamic input" from one table that has to be applied to another table, try MERGE.

    SQL was never intended to be, and will never be, a replacement for native I/O. Native I/O with SETLL/READE, CHAIN and so on was always a crutch, because the systems of that time were not able to process "mass data operations".

    That's just my two cents and my very personal opinion.

    HTH

    Daniel

    P.S.: If you need more help, feel free to post more details or send me a PM.

    [Edit: corrected a typo in SQL statement]

    ------------------------------
    Daniel Gross
    ------------------------------