Informix

Informix

Connect with Db2, Informix, Netezza, open source, and other data experts to gain value from your data, share insights, and solve problems.

 View Only
Expand all | Collapse all

SQL query cost is very high

  • 1.  SQL query cost is very high

    Posted Thu November 02, 2023 05:07 AM

    Dear Friends, 

    We are using IDS 14X and the below SQL query is taking high query cost even if it perform INDEX path. Could you pl help us to optimize the SQL query below. 

    select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no

                   and b.connection_type='PRE' and b.disconnected_on is null

                    and trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi') and b.connected_date <= date(a.active_from)

                    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')

                    and a.flag='N' and date(a.active_untill)>'2023-07-31'

    Thanks !

    Best Regards,

    Indika 



    ------------------------------
    Indika Jinadasa
    ------------------------------


  • 2.  RE: SQL query cost is very high

    Posted Thu November 02, 2023 05:09 AM
    Query plan would be useful

    On 11/2/2023 4:06 AM, Indika Jinadasa via IBM TechXchange Community wrote:
    0100018b8f4925bb-37054389-e021-4f73-91b4-f0b956ba7f90-000000@email.amazonses.com">
    Dear Friends,  We are using IDS 14X and the below SQL query is taking high query cost even if it perform INDEX path. Could you pl help us to...





  • 3.  RE: SQL query cost is very high

    Posted Thu November 02, 2023 05:49 AM

    Thanks Watson,  here is the query plan. 

    QUERY: (OPTIMIZATION TIMESTAMP: 11-02-2023 10:22:00)
    ------
    select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no

                   and b.connection_type='PRE' and b.disconnected_on is null

                    and trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi') and b.connected_date <= date(a.active_from)

                    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')

                    and a.flag='N' and date(a.active_untill)>'2023-07-31'


    Estimated Cost: 1485208
    Estimated # of Rows Returned: 32364

      1) informix.a: INDEX PATH

            Filters: ((trunc(informix.a.active_untill , 'mi' ) <= trunc(CURRENT year to fraction(3)+ interval( 24) hour to hour , 'mi' ) AND DATE (informix.a.active_untill ) > 2023/07/31 ) AND informix.a.cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' )) 

        (1) Index Name: informix.idx_cbc_mobile_4
            Index Keys: flag   (Serial, fragments: ALL)
            Lower Index Filter: informix.a.flag = 'N' 

      2) informix.b: INDEX PATH

            Filters: ((informix.b.connected_date <= DATE (informix.a.active_from ) AND informix.b.connection_type = 'PRE' ) AND informix.b.disconnected_on IS NULL ) 

        (1) Index Name: informix.itst
            Index Keys: mobile_no 
            Lower Index Filter: informix.a.mobile_no = informix.b.mobile_no 
    NESTED LOOP JOIN


    Query statistics:
    -----------------

      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                a
      t2                b

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     271329     460904    6456016    00:49.88   294370  

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     248436     6997341   309213     01:02.00   3       

      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   248436     32364     01:52.02   1485208 



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 4.  RE: SQL query cost is very high

    Posted Thu November 02, 2023 06:50 AM

    Indika:

    Here are the issues that I can see:

    • All estimated rows are off versus the actual rows produced. That tells me that your data distributions are either outdated or insufficiently detailed. Run the recommended suite of UPDATE STATISTICS commands, or just use my dostats utility to do that for you.
    • This filter 
               and date(a.active_untill)>'2023-
      07-31'
      is casting active_untill to DATE then comparing it to a date string. The optimizer may be casting it back to a string in order to compare. Better to cast the string to a DATE yourself to make sure that the more efficient comparison is used:
                and date(a.active_untill) > DATE( '2023-07-31' )
    • This filter:
                and trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi')

      is unnecessarily applying the TRUNC() function several hundred thousand times. Better to do casts. In my testing the following will be 25% faster (at least for this operation, the effect on the query as a whole may not be that dramatic, but it might):
                and a.active_untill::DATETIME YEAR TO MINUTE <= (current + interval(24) hour to hour)::DATETIME YEAR TO MINUTE

    Let us know how this goes.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 5.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 04:05 AM

    Hi Art,

    Thank you very much for your tips. 

    Pl. see the SQL query cost for the changed SQ.L. It looks there is not much improvement of the query cost. 

    Best Regards,

    Indika 

        

    QUERY: (OPTIMIZATION TIMESTAMP: 11-03-2023 13:06:57)
    ------
    select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no

                   and b.connection_type='PRE' and b.disconnected_on is null

    --                and trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi') and b.connected_date <= date(a.active_from)
    and a.active_untill::DATETIME YEAR TO MINUTE <= (current + interval(24) hour to hour)::DATETIME YEAR TO MINUTE and b.connected_date <= date(a.active_from)

                    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')

                    and a.flag='N' and date(a.active_untill)>date('2023-07-31')


    Estimated Cost: 1486378
    Estimated # of Rows Returned: 32518

      1) informix.a: INDEX PATH

            Filters: ((informix.a.active_untill ::datetime year to minute<= CURRENT year to fraction(3)+ interval( 24) hour to hour ::datetime year to minuteAND DATE (informix.a.active_untill ) > 2023/07/31 ) AND informix.a.cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' ))

        (1) Index Name: informix.idx_cbc_mobile_5
            Index Keys: flag   (Serial, fragments: ALL)
            Lower Index Filter: informix.a.flag = 'N'

      2) informix.b: INDEX PATH

            Filters: ((informix.b.connected_date <= DATE (informix.a.active_from ) AND informix.b.connection_type = 'PRE' ) AND informix.b.disconnected_on IS NULL )

        (1) Index Name: informix.itst
            Index Keys: mobile_no
            Lower Index Filter: informix.a.mobile_no = informix.b.mobile_no
    NESTED LOOP JOIN

    Query statistics:
    -----------------

      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                a
      t2                b

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     32795      460904    6192146    00:42.24   294368

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     13060      7020750   46800      00:00.46   3

      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   13060      32518     00:42.72   1486378



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 6.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 06:59 AM

    Indika:

    Ignore the "cost" of the query. That is just an internal number used to evaluate different query plans for the current version of the query. You cannot compare the cost between different versions of the query and a better query plan does not necessarily result in a lower "cost". The absolute number for the "cost" is meaningless.

    In this case, those small changes resulted in the query improving from over a minute and a 52 seconds down to 45 seconds. That is not "not much improvement: it is a 60%+ improvement. 

    That said, I still notice that the estimated rows versus actual rows produced is way off. Again that indicates that the data distributions for both tables are either out-of-date or insufficiently detailed. Especially for the larger of the two tables (presumably that's smgt004.

    I also see that you did not eliminate the calls to TRUNC(). I would again recommend that you use casts instead:
           and a.active_untill::DATETIME YEAR TO MINUTE <= (current + interval(24) hour to hour)::DATETIME YEAR TO MINUTE
    because this filter might be satisfied by using your existing idx_cbc_mobile_2 index, especially if you enable multi-index scanning. You can test that by including the optimizer directive for that:

    SELECT {+ MULTI_INDEX( cbc_mobile ) } a.mobile_no, a.cbc_package, a.active_from, a.active_untill ...

    I agree with Henri that you can probably do better even than this 45 second version.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 7.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 08:33 AM

    Hi Art, 

    Thanks for the support. 

    Here is the query statistics . 

    QUERY: (OPTIMIZATION TIMESTAMP: 11-06-2023 18:44:08)
    ------
    SELECT {+ MULTI_INDEX( cbc_mobile ) }a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no
    --select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no

                   and b.connection_type='PRE' and b.disconnected_on is null

    and a.active_untill::DATETIME YEAR TO MINUTE <= (current + interval(24) hour to hour)::DATETIME YEAR TO MINUTE and b.connected_date <= date(a.active_from)

                    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')

                    and a.flag='N' and date(a.active_untill)>date('2023-07-31')


    DIRECTIVES FOLLOWED:
    DIRECTIVES NOT FOLLOWED:
    MULTI_INDEX ( cbc_mobile ) Directives rule out all access paths for table.

    Estimated Cost: 1576120
    Estimated # of Rows Returned: 32518

      1) informix.a: SEQUENTIAL SCAN

            Filters: (((informix.a.active_untill ::datetime year to minute<= CURRENT year to fraction(3)+ interval( 24) hour to hour ::datetime year to minuteAND DATE (informix.a.active_untill ) > 2023/07/31 ) AND informix.a.cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' )) AND informix.a.flag = 'N' )

      2) informix.b: INDEX PATH

            Filters: ((informix.b.connected_date <= DATE (informix.a.active_from ) AND informix.b.connection_type = 'PRE' ) AND informix.b.disconnected_on IS NULL )

        (1) Index Name: informix.itst
            Index Keys: mobile_no
            Lower Index Filter: informix.a.mobile_no = informix.b.mobile_no
    NESTED LOOP JOIN

    Query statistics:
    -----------------

      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                a
      t2                b

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     27918      460904    7079660    00:26.49   384110

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     8802       7020750   41030      00:00.38   3

      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   8802       32518     00:26.89   1576120

    Best regards,

    Indika 



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 8.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 08:59 AM

    Indika:

    Great! So, I see that the MULTI_INDEX directive is being ignored so you don't need that, however, modifying the filters got this down from almost 2 minutes to 56 seconds and then to 42 seconds. Now these changes have changed the processing of the cbc_mobile table from an index scan to a table scan and the runtime is down further to just under 27 seconds! I am not sure that fixing the data distributions with dostats will get you any faster, but it is definitely worth a try. You could also try the MULTI_INDEX directive again at some point if you create additional indexes on the remaining unindexed filters.

    BTW, once you generate proper distributions (AUS missing some tricks that dostats does) AUS will duplicate the same levels of distributions for you going forward, though I would still run dostats manuall or from cron say once a month or so.

    Last note: Referencing that long IN() clause: If that list of strings was extracted from a smallish lookup table using a query with some filters on other columns, then you might try just joining to that lookup table with the original filters on it instead of the IN clause. Sometimes that is faster for this query plus it eliminates the separate query and overhead to build the list.

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------



  • 9.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 06:45 PM

    Hi,

    How many rows in the cbc_mobile table have 

    • flag='N' and active_untill >= '2023-07-31 00:00:00'
    • f;ag='N' and cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' )) A

    ??

    I suspect table cbc_mobile needs an index on  (flag,cbc_package,active_untill)

    Also what indexes are on table smgt004?

    I suspect table smgt004 needs an index on (mobile,connection_type,disconnected_on,connected_date)

    Regards,
    David.



    ------------------------------
    David Williams
    ------------------------------



  • 10.  RE: SQL query cost is very high

    Posted Wed November 08, 2023 09:36 AM

    Hi David,

    Thanks for the support. 

    It has 10043638 records. 

    It has following composite index. ->  (mobile_no,cbc_package,pack_seq_no,active_from,active_untill,added_on,added_by,terminated_on,terminated_by,flag)) and 

    another two indexes  for 'active_until' and 'flag' columns separately. 

    Here are the index details for smgt004 

    cluster index ->     (sim_no,disconnected_on) using btree  in table ;
      Other index columns - >

    (debit_led_acc_code), 
      (debit_led_code, debit_led_acc_code) 
     (subscriber_code)
      (connected_date)
       (disconnected_on)
    (mobile_no)
     (connection_type) 
    (sales_code)
      (status_total)
       
    Best Regards,

    Indika 
        



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 11.  RE: SQL query cost is very high

    Posted Thu November 02, 2023 08:33 AM

    Hi Indika,

    From the statistics at the end of the query plan, it's apparent that you are reading a lot of records, to retrieve a few.  In the read of cbc_mobile, you are reading almost 6.5 million records to return 271,000...about 4%.  That's a lot of records that are read unnecessarily.  Those other 96% of records are being discarded because they don't match the filter conditions.  You should try and find out which of these filter conditions, or combinations of them, are the most restrictive and look for an index on that.  It could be that cbc_package would be a better column to be indexed than the "flag" column, or maybe both, e.g. cbc_package, flag, active_untill.

    As Art said, the use of TRUNC is likely hurting you.  Informix 14 now does some things that makes the use of date or datetime casting unnecessary.  You may get lucky with just removing the date function from date(a.active_untill)>'2023-07-31' or changing it to a.active_untill >= '2023-07-31 00:00:00'.  It may take some variations to get this right, and if the active_untill is not a great filter or is not indexed then it may be of limited benefit.  



    ------------------------------
    Mike Walker
    xDB Systems, Inc
    www.xdbsystems.com
    ------------------------------



  • 12.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 04:14 AM

    HI Mike, 

    Thank  you very much for the tips. Here are the table indexes details. 

    create index "informix".idx_cbc_mobile_1_dup_01 on "informix".cbc_mobile (mobile_no,cbc_package,pack_seq_no,active_from,active_untill,added_on,added_by,terminated_on,terminated_by,flag) using  btree  in idxdb_8k_4;
    create index "informix".idx_cbc_mobile_2 on "informix".cbc_mobile(active_untill) using btree  in idxdb_8k_5;
    create index "informix".idx_cbc_mobile_5 on "informix".cbc_mobile(flag) using btree  in idxdb_8k_5;

    Best Regards,

    Indika 



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 13.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 08:46 AM

    Thank you for this, it is useful.

    Can you please run the following and post the results:

    select count(*) from cbc_mobile;

    select count(*) from cbc_mobile where cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' );

    select count(*) from cbc_mobile where cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' ) and flag = "N";

    Also, try the query, replacing:

       trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi') 

    with

       a.active_untill <= (current + 24 units hour)

    This may not give you the exact same results because of the minute precision but let's see how the query performs.  In fact, what is the definition of the active_untill column?



    ------------------------------
    Mike Walker
    xDB Systems, Inc
    www.xdbsystems.com
    ------------------------------



  • 14.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 07:55 AM

    Hello Mike,

    Thank you very much for your support. 

    Here are the answers to your questions. 

    select count(*) from cbc_mobile ->  10043638

    "select count(*) from cbc_mobile where cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' )- >   9900070

    select count(*) from cbc_mobile where cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' ) and flag = 'N'; ->   7352474


    Set explain out - > 

    QUERY: (OPTIMIZATION TIMESTAMP: 11-06-2023 17:58:33)
    ------
    select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no

                   and b.connection_type='PRE' and b.disconnected_on is null

    --                and trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi') and b.connected_date <= date(a.active_from)
    and a.active_untill <= (current + 24 units hour) and b.connected_date <= date(a.active_from)
    --and a.active_untill::DATETIME YEAR TO MINUTE <= (current + interval(24) hour to hour)::DATETIME YEAR TO MINUTE and b.connected_date <= date(a.active_from)

                    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')

                    and a.flag='N' and date(a.active_untill)>date('2023-07-31')


    Estimated Cost: 3828822
    Estimated # of Rows Returned: 96437

      1) informix.a: INDEX PATH

            Filters: ((DATE (informix.a.active_untill ) > 2023/07/31 AND informix.a.cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' )) AND informix.a.active_untill <= CURRENT year to fraction(3)+ interval(        24) hour(9) to hour )

        (1) Index Name: informix.idx_cbc_mobile_5
            Index Keys: flag   (Serial, fragments: ALL)
            Lower Index Filter: informix.a.flag = 'N'

      2) informix.b: INDEX PATH

            Filters: ((informix.b.connected_date <= DATE (informix.a.active_from ) AND informix.b.connection_type = 'PRE' ) AND informix.b.disconnected_on IS NULL )

        (1) Index Name: informix.itst
            Index Keys: mobile_no
            Lower Index Filter: informix.a.mobile_no = informix.b.mobile_no
    NESTED LOOP JOIN


    Query statistics:
    -----------------

      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                a
      t2                b

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     77911      1366892   6252543    00:29.03   294368
     type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     55843      7020750   97080      00:27.70   3

      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   55843      96438     00:56.78   3828822
    QUERY: (OPTIMIZATION TIMESTAMP: 11-06-2023 17:58:33)
    ------
    select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no

                   and b.connection_type='PRE' and b.disconnected_on is null

    --                and trunc(a.active_untill,'mi')<=trunc(current + interval(24) hour to hour, 'mi') and b.connected_date <= date(a.active_from)
    and a.active_untill <= (current + 24 units hour) and b.connected_date <= date(a.active_from)
    --and a.active_untill::DATETIME YEAR TO MINUTE <= (current + interval(24) hour to hour)::DATETIME YEAR TO MINUTE and b.connected_date <= date(a.active_from)

                    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')

                    and a.flag='N' and date(a.active_untill)>date('2023-07-31')


    Estimated Cost: 3828822
    Estimated # of Rows Returned: 96437

      1) informix.a: INDEX PATH

            Filters: ((DATE (informix.a.active_untill ) > 2023/07/31 AND informix.a.cbc_package IN ('SN' , 'YT' , 'IM' , '9IN1' , '3IN1' , 'TK7' , 'T30' , 'CMB7' , 'CMB30' , '3IN1-1' , '3IN1-5' , '9IN1-2' , '9IN1-7' , 'MX7' , 'MX30' , 'ULCBC' )) AND informix.a.active_untill <= CURRENT year to fraction(3)+ interval(        24) hour(9) to hour )

        (1) Index Name: informix.idx_cbc_mobile_5
            Index Keys: flag   (Serial, fragments: ALL)
            Lower Index Filter: informix.a.flag = 'N'

      2) informix.b: INDEX PATH

            Filters: ((informix.b.connected_date <= DATE (informix.a.active_from ) AND informix.b.connection_type = 'PRE' ) AND informix.b.disconnected_on IS NULL )

        (1) Index Name: informix.itst
            Index Keys: mobile_no
            Lower Index Filter: informix.a.mobile_no = informix.b.mobile_no
    NESTED LOOP JOIN


    NESTED LOOP JOIN


    Query statistics:
    -----------------

      Table map :
      ----------------------------
      Internal name     Table name
      ----------------------------
      t1                a
      t2                b

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t1     77911      1366892   6252543    00:29.03   294368

      type     table  rows_prod  est_rows  rows_scan  time       est_cost
      -------------------------------------------------------------------
      scan     t2     55843      7020750   97080      00:27.70   3

      type     rows_prod  est_rows  time       est_cost
      -------------------------------------------------
      nljoin   55843      96438     00:56.78   3828822

    Table schema 

    create table "informix".cbc_mobile
      (
        mobile_no varchar(10) not null ,
        cbc_package varchar(10) not null ,
        pack_seq_no varchar(100),
        active_from datetime year to second not null ,
        active_untill datetime year to second,
        added_on datetime year to second not null ,
        added_by varchar(15) not null ,
        terminated_on datetime year to second,
        terminated_by varchar(5),
        flag varchar(1)
      ) in edrdata_06  extent size 1000 next size 1000 lock mode row;

    revoke all on "informix".cbc_mobile from "public" as "informix";


    create index "informix".idx_cbc_mobile_1_dup_01 on "informix".cbc_mobile
        (mobile_no,cbc_package,pack_seq_no,active_from,active_untill,
        added_on,added_by,terminated_on,terminated_by,flag) using
        btree  in idxdb_8k_4;
    create index "informix".idx_cbc_mobile_2 on "informix".cbc_mobile
        (active_untill) using btree  in idxdb_8k_5;
    create index "informix".idx_cbc_mobile_5 on "informix".cbc_mobile
        (flag) using btree  in idxdb_8k_5;

        



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 15.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 08:20 AM

    Thanks for this information.  It shows that the other columns used in your where clause (cbc_package and flag) are not very restrictive.  So you need to work on getting that date filter working properly with an index.

    Let's try updating statistics on the active_untill column, and running a query that doesn't use any functions on the column itself.  Again, this may not return the exact results you need, but we are trying to see what performance is like with the index.

    update statistics high for table cbc_mobile(actve_untill);

    select a.mobile_no, a.cbc_package, a.active_from, a.active_untill from cbc_mobile a, smgt004 b where a.mobile_no=b.mobile_no
    and b.connection_type='PRE' and b.disconnected_on is null
    and a.active_untill <= (current year to minute + 24 units hour)
    and b.connected_date <= date(a.active_from)
    and a.cbc_package in ('SN','YT','IM','9IN1','3IN1','TK7','T30','CMB7','CMB30', '3IN1-1', '3IN1-5', '9IN1-2', '9IN1-7', 'MX7', 'MX30', 'ULCBC')
    and a.flag='N' and a.active_untill > "2023-07-31 23:59:59";
     


    ------------------------------
    Mike Walker
    xDB Systems, Inc
    www.xdbsystems.com
    ------------------------------



  • 16.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 02:45 AM

    Hello Indika,

    Thank you for posting the request in the chat.
    That's a little competition to see who has the best optimization ideas and what Informix teamwork can be ;-) 

    Art has already given many hints.

    Here are a few more ideas:

    The query statistic is interesting. The 50 sec. for the scan of the approx. 6.5 million records of t1 (a) sounds very high.
    Did you run the query several times to produce cache read versus phys.read?
    The overhead of the various functions in the filter certainly slows things down - but only with slow CPUs or a lack of CPU resources, or with virtual servers with overcommittment of the CPUs.

    It is similar for T2 .

    Have you ever tried PDQ at this point? And perhaps also the fragmentation of the tables and thus the linear decomposition of the scans?
    Of course, this is only possible from EE and assuming enough cpu power (and cpuvp's).
    I think the query should be executable in less than 5 seconds...that's the competition;-)

    Best Regards
    Henri



    ------------------------------
    Henri Cujass
    leolo IT, CTO
    Germany
    IBM Champion 2021 2022 2023
    IIUG Board Member
    henri.cujass@leolo.com
    ------------------------------



  • 17.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 04:18 AM

    Hi Henri, 

    Thank you very much for the suggestions and will try you suggestions too specially fragmentation. 

    Best Regards,

    Indika 



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 18.  RE: SQL query cost is very high

    Posted Fri November 03, 2023 05:06 PM
    Hi Indika,
    There is a big difference between rows estimated and rows produced. Possibly your statistics are not up to date. You may like to run update statistics on tables involved in SQL and try.

    Regards,
    Nandkishor Singare
    -------------------------------







  • 19.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 08:11 AM

    Hi Nandkishor, 

    Thanks for the support. 

    Here are the summary of the 2 table statistics.  We are using  AUS. 

    dbschema -d <DB> -hd smgt004 |grep -i constr

    Constructed on 2023-11-06 05:18:11.00000
    Constructed on 2023-11-06 05:17:05.00000
    Constructed on 2023-11-06 05:18:49.00000
    Constructed on 2023-11-06 05:13:59.00000
    Constructed on 2023-11-06 05:19:22.00000

    $dbschema -d <DB> -hd cbc_mobile |grep -i constr
    Constructed on 2023-11-05 17:45:32.00000

    Best REgards,

    Indika 



    ------------------------------
    Indika Jinadasa
    ------------------------------



  • 20.  RE: SQL query cost is very high

    Posted Mon November 06, 2023 08:43 AM

    Indika:

    Those dates for the data distributions are not useful as is. We need to know what columns have distributions, the level of distributions (MEDIUM or HIGH) as well as the dates for both tables. Based on your post not all columns have distributions. Also, according your earlier post, you do not have indexes on all of the filter and join columns on these two tables.

    Finally, as you may be aware, I am not a fan of AUS and i STRONGLY suggest that you use my dostats utility to generate possibly better distributions. If you do not already have dostats, you can download the utils2_ak package from my web site at www.askdbmgt.com/my-utiliies where you will find a download button for that package which contains, among others, the source for dostats (dostats_ng.ec). Unpack the file and build it. It will build out-of-the-box on Linux or Solaris (the package contains instructions for modifying the make files for AIX and HP-UX). I have never tried to build it on Windows and there you may have to build your own make file as MS Make is different from its UNIX cousins or just manually compile it with:
    esql -o dostats dostats_ng.ec getopt.c

    Running dostats with no arguments prints out its Usage, but a basic command line is usually good:

    dostats -d <database>

    -- or --

    dostats -d <database> -t <table>

    For very large tables, you may want to increase the number of HIGH distribution buckets with the -R option. For a 10million row table I would suggest -R 0.2 to get 500 buckets instead of the default 200. Note that MEDIUM samples a very small number of rows, so you may want to use the SAMPLING option to update statistics to get a larger sample. Dostats provides the -Z option. Values greater than 1 are processed as an absolute number of rows to sample and values less than 1 (ie decimal fractions) are treated as the fraction of the rows in the table (so -Z 0.1 would sample 10% of the rows).

    Art



    ------------------------------
    Art S. Kagel, President and Principal Consultant
    ASK Database Management Corp.
    www.askdbmgt.com
    ------------------------------