Informix

  • 1.  Sequential scans issues

    Posted Thu October 08, 2020 09:19 AM
    Hi,

    I have a high number of sequential scans on my tables, and since I have more than 3,000 tables it is difficult to identify the SQL that causes them.
    Has anyone had experience with similar situations who can give me some help?
    Is there a way to identify the statements that cause sequential scans on Informix?

    Thanks for any help,

    SP

    ------------------------------
    Sergio Peres
    AIRC
    Coimbra
    ------------------------------

    #Informix


  • 2.  RE: Sequential scans issues

    IBM Champion
    Posted Thu October 08, 2020 10:05 AM
    Here Sergio:

    SELECT FIRST 100
           stn.dbsname as Database,
           stn.tabname as Table,
           sti.ti_nptotal * pagesize / 1024 as Table_Size_KB,
           sti.ti_nrows as Number_Of_Rows,
           spp.seqscans as Number_Of_Sequential_Scans,
           (sti.ti_nrows * spp.seqscans) as Total_Rows_Scanned
    FROM   systabnames stn,
           systabinfo sti,
           sysptprof spp,
           sysdbstab sd
    WHERE  stn.partnum = sti.ti_partnum
      AND  stn.partnum = spp.partnum
      AND  spp.seqscans > 0
      AND  sd.dbsnum = partdbsnum( stn.partnum )
    ORDER BY 6 DESC, 5 DESC;

    database            syscdr
    table               grid_cdr_node_tab
    table_size_kb       16.0000000000000
    number_of_rows      1
    number_of_sequent+  232044
    total_rows_scanned  232044

    database            sysadmin
    table               ph_task
    table_size_kb       32.0000000000000
    number_of_rows      50
    number_of_sequent+  1026
    total_rows_scanned  51300

    database            sysadmin
    table               aus_command
    table_size_kb       464.000000000000
    number_of_rows      2520
    number_of_sequent+  2
    total_rows_scanned  5040
    ...

    I look to target tables that have very high numbers of scans with moderate numbers of rows and those with moderate numbers of scans but very high numbers of rows. The query will report partitioned tables multiple times.

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 3.  RE: Sequential scans issues

    Posted Thu October 08, 2020 12:21 PM
    Thanks for the reply, Art.

    I can identify the tables; my problem is knowing which statements are causing the sequential scans...
    For example, after one working day I get values like this, and I have more than 1,000 lines on the report!

    SP

    ------------------------------
    Sergio Peres
    AIRC
    Coimbra
    ------------------------------



  • 4.  RE: Sequential scans issues

    IBM Champion
    Posted Thu October 08, 2020 01:17 PM
    You could do as Scott suggests and query sysmaster:syssqexplain for running queries and filter or grep for the table names that you know are being scanned.
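
    A minimal sketch of that approach (sqx_seqscan and sqx_sqlstatement are the column names Scott mentions below; sqx_sessionid and sqx_estcost are the usual companions, but verify them on your version) listing the statements currently executing with at least one sequential scan, so the text can be grepped for the tables of interest:

       SELECT sqx_sessionid, sqx_estcost, sqx_sqlstatement
       FROM   sysmaster:syssqexplain
       WHERE  sqx_seqscan > 0
       ORDER BY sqx_estcost DESC;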

    Another option is to turn on SQLTRACE with a buffer wide enough to capture your widest SQL and enough slots to hold a reasonable amount of query history, so that you can poll sysmaster:syssqltrace and its child pseudo-tables at a reasonable interval without losing too many queries (gaps in the query id numbers tell you how many you missed between polls). Dump each poll to a file that you can then run grep -f <file> against, with <file> filled with the interesting table names.
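
    For reference, SQLTRACE can be set in the ONCONFIG file or switched on dynamically through the SQL admin API. The numbers below (2,000 traces of 4 KB each) are only illustrative, and the admin API argument order should be checked against the docs for your version. In $ONCONFIG (takes effect at the next restart):

       SQLTRACE level=low,ntraces=2000,size=4,mode=global

    or dynamically, run while connected to the sysadmin database:

       EXECUTE FUNCTION task("set sql tracing on", 2000, 4, "low", "global");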

    The sqltrace data doesn't contain the number of sequential scans, but if the number of bufreads is significantly higher than the number of rows processed, then you know a scan of a large table did happen!
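
    A rough polling sketch along those lines, assuming the usual syssqltrace column names (sql_id, sql_bfreads, sql_actualrows, sql_runtime, sql_statement); the 100x ratio is just an arbitrary starting threshold:

       SELECT sql_id, sql_bfreads, sql_actualrows, sql_runtime, sql_statement
       FROM   sysmaster:syssqltrace
       WHERE  sql_bfreads > 100 * (sql_actualrows + 1)
       ORDER BY sql_bfreads DESC;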

    Also, I disagree with Scott. Not all small table scans should be ignored. Of course a scan of a table that takes up only a few pages is fine, normal, and expected, but scans of middle-sized tables that add 1/20 of a second to a query and occur 5 million times a day are adding 250,000 seconds, almost 70 hours, combined to your daily processing across multiple applications!

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 5.  RE: Sequential scans issues

    Posted Thu October 08, 2020 11:02 AM
    Sergio,
     
    Some things about sequential scans:
     
    Detection:
     
    1) Any small table will get a sequential scan.
     
    2) Generally, I look at those tables with more than 60,000 rows as these are the ones that can do the most damage.
     
    3) By definition, a sequential scan is a search of a table without the aid of an index. So a 7-million-row table being searched with no index available is always a problem.
     
    4) Typically it's user queries on user tables you want to look at. So you could exclude the Informix-provided databases from your data on tables doing sequential scans (so seqscans > 0 and nrows > 60,000 and dbsname not in ("sysmaster","sysadmin",etc.)). Sort in descending order on sequential scans. This gives you a place to start looking; a sketch of such a query appears after this list.
     
    5) What would be even better is to look at the queries actually doing sequential scans.
     
    6) Live running queries can be found in sysmaster:syssqexplain.sqx_sqlstatement (this captures the first 32,000 bytes of the query in memory) where sysmaster:syssqexplain.sqx_seqscan > 0. There is one small problem here: Informix will run parts of a query with a multi-part WHERE clause as a sequential scan, for example a join to a table with fewer than 2,000 rows. So capture all of the running sequential-scan queries into a file, multiple times at different times of the week, chop them out into separate files, add SET EXPLAIN ON AVOID_EXECUTE over the top of each query, and send the query plan to sqexplain.out; that captures how Informix would run the query without executing it (see the sketch after this list). Knowing the originating source code files pays off big time here. Informix has something called the estimated numeric cost of running a query. Any query running a sequential scan will have an abnormal estimated cost (normally you see single digits, or up to 3 digits, for properly executing queries), and an estimated cost > 10,000 is worth looking at, especially if the query runs a lot in a lot of sessions.
    Any query with a sequential scan and an estimated cost > 100,000 is definitely a problem, and those over 1,000,000 are likely to take longer than anyone's lifetime to return an answer and are therefore worth fixing.
     
    So it is a bit involved, but this is how you do it, and gradually you prove your case to the programming staff by having the proof in hand. Otherwise, they will say it is the responsibility of the DBA to do it. Save all of your SET EXPLAIN outputs as evidence. Of course, if you are both the DBA and the programmer... you have to fix it.
     
    7) Make sure you are running UPDATE STATISTICS as indicated in the scheduler.
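
    A sketch of the starting-point query from 4), built on the same sysmaster pseudo-tables as Art's query above; the list of excluded databases is only illustrative, so extend it with whatever system databases exist on your instance:

       SELECT stn.dbsname, stn.tabname, sti.ti_nrows, spp.seqscans
       FROM   systabnames stn, systabinfo sti, sysptprof spp
       WHERE  stn.partnum = sti.ti_partnum
         AND  stn.partnum = spp.partnum
         AND  spp.seqscans > 0
         AND  sti.ti_nrows > 60000
         AND  stn.dbsname NOT IN ("sysmaster", "sysadmin", "syscdr", "sysutils")
       ORDER BY spp.seqscans DESC;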
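
    And a sketch of the SET EXPLAIN technique from 6); the statement shown is only a placeholder for one of the captured queries:

       SET EXPLAIN ON AVOID_EXECUTE;

       SELECT * FROM customer WHERE lname = 'Pauli';   -- placeholder: paste a captured statement here

       SET EXPLAIN OFF;

    The optimizer writes the plan, including any SEQUENTIAL SCAN steps and the Estimated Cost, to sqexplain.out in the current directory without executing the statement.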
     
    Solutions:
     
    one or more of these:
     
    1) Query rewrite
    2) Update statistics
    3) New Index creation
     
    There are several other possibilities for slow systems as well, but for sequential scans, this is a good start.
     
    Scott Pickett
    IBM Informix WW Technical Sales
    IBM Informix WW Cloud Technical Sales
    IBM Informix WW Cloud Technical Sales ICIAE
    IBM Informix WW Informix Warehouse Accelerator Sales
    Boston, Massachusetts USA
    spickett@us.ibm.com
    617-899-7549
    33 Years Informix User
     





  • 6.  RE: Sequential scans issues

    Posted Thu October 08, 2020 11:14 AM
    Slight rewrite of my previous post to take care of a typo:
     
    6) Live running queries can be found in sysmaster:syssqexplain.sqx_sqlstatement (this captures the first 32,000 bytes of the query in memory) where sysmaster:syssqexplain.sqx_seqscan > 0. There is one small problem here: Informix will run parts of a query with a multi-part WHERE clause as a sequential scan, for example a join to a table with fewer than 2,000 rows. So capture all of the running sequential-scan queries into a file, multiple times at different times of the week, chop them out into separate files, add SET EXPLAIN ON AVOID_EXECUTE over the top of each query, and send the query plan to sqexplain.out; that captures how Informix would run the query without executing it. Knowing the originating source code files pays off big time here. Informix has something called the estimated numeric cost of running a query. Any query running a sequential scan will have an abnormal estimated cost (normally you see single digits, or up to 3 digits, for properly executing queries), and an estimated cost > 10,000 is worth looking at, especially if the query runs a lot in a lot of sessions.
    Any query with a sequential scan and an estimated cost > 100,000 is definitely a problem, and those over 1,000,000 are likely to take longer than anyone's lifetime to return an answer and are therefore worth fixing.
    Scott Pickett
    IBM Informix WW Technical Sales
    IBM Informix WW Cloud Technical Sales
    IBM Informix WW Cloud Technical Sales ICIAE
    IBM Informix WW Informix Warehouse Accelerator Sales
    Boston, Massachusetts USA
    spickett@us.ibm.com
    617-899-7549
    33 Years Informix User
     





  • 7.  RE: Sequential scans issues

    Posted Thu October 08, 2020 01:07 PM
    Thanks for the reply, Scott.

    I don't have very large tables; I think one of the problems is related to indexes...
    I also looked at unused indexes, and we are going through the list to identify some of them.
    We also use functions like NVL that can have this effect, but we are looking for a quick way to find the specific problems, because the traces we have done so far don't give significant information.
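
    For instance, with a hypothetical orders table, a predicate like the first one below hides the column inside NVL, so the optimizer falls back to a scan even if ship_date is indexed; the logically equivalent rewrite leaves the column bare and gives the index a chance:

       -- NVL wrapped around the column prevents index use:
       SELECT * FROM orders WHERE NVL(ship_date, TODAY) = TODAY;

       -- equivalent rewrite with the column left bare:
       SELECT * FROM orders WHERE ship_date = TODAY OR ship_date IS NULL;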

    SP

    ------------------------------
    Sergio Peres
    AIRC
    Coimbra
    ------------------------------