Sterling Managed File Transfer

SCBN Standard & Premium data overload and solution with external purge

By Tanvi Kakodkar posted Tue March 03, 2020 02:33 AM

  
Originally updated on 16th September 2019 by VinceTkac

Authors:    

Vince Tkac

STSM, IBM Cognitive Apps

vtkac@us.ibm.com

Doug Brown

IBM Cognitive Apps

dcbrown@us.ibm.com

 

Josh Rice

Architect, IBM Cognitive Apps

jmrice@us.ibm.com

 

SCBN Standard & Premium is a hosted, high-volume, business-to-business data processing system that incorporates B2Bi to process millions of business documents per month.  The SCBN Standard & Premium service-oriented architecture deploys IBM B2Bi software in "pods".  A pod is a two-node instance of B2Bi with a DB2 database, connected to the message bus using IBM WebSphere.  SCBN Standard & Premium uses B2Bi for both protocols and data transformation.  This article discusses an index and purge issue encountered on a B2Bi data transformation pod and the process used to remediate it.

 

In one of these pods, the B2Bi index and purge processes fell behind.  Data built up quickly and the normal processes were not able to catch up.  We were at roughly 240 million entries in WORKFLOW_CONTEXT and 65 million in ARCHIVE_INFO.  The external purge job was taking hours just to get a list of BPs to purge (see QUERY 1) and the BP indexer was marking 100k entries every 13-16 hours.
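For reference, row counts at this scale can be confirmed either with direct COUNT(*) scans (slow on tables this large) or from the DB2 catalog statistics, which reflect the counts recorded at the last RUNSTATS.  A minimal sketch against the catalog:

SELECT TABNAME, CARD
FROM   SYSCAT.TABLES
WHERE  TABNAME IN ('WORKFLOW_CONTEXT', 'ARCHIVE_INFO')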

 

Query plans showed that the sheer record counts were causing problems even where the indexes were healthy.
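One way to reproduce those plans is DB2's explain facility.  A hedged sketch, not the exact commands used here; the database name and literal values are placeholders, and the explain tables must already exist (e.g. created from sqllib/misc/EXPLAIN.DDL):

-- capture the plan for the purge selection query, with literals
-- substituted for the parameter markers
EXPLAIN PLAN FOR
SELECT wf_id, group_id, archive_flag, archive_date
FROM   archive_info
WHERE  archive_date < '2018-01-01-00.00.00.000000'
       AND archive_flag IN (1, 2);

-- then format the most recently captured plan from the shell:
--   db2exfmt -d B2IDB -1 -o purge_plan.txt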

 

As a non-intrusive option, we wanted to move data out of the primary work tables to reduce the result counts and speed up the maintenance queries.  We would do this for both ARCHIVE_INFO and WORKFLOW_CONTEXT, allowing external purge and BP_index to complete.  The intent was then to slowly move the ARCHIVE_INFO and WORKFLOW_CONTEXT data back into the primary tables for external purge and BP_index to work on as normal.

 

A custom DB2 stored procedure was created to move records out of ARCHIVE_INFO and into a save table.  This took a few days to run but reduced ARCHIVE_INFO to a more normal working size of around 23 million rows.  DB2 RUNSTATS was executed at this point, and we could see the initial selection query from external purge return quickly.
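The statistics refresh was roughly as follows (a sketch; the schema qualifier depends on the B2Bi database user and is a placeholder here):

RUNSTATS ON TABLE SI_USER.ARCHIVE_INFO
    WITH DISTRIBUTION AND DETAILED INDEXES ALL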

 

The stored procedure moves 100k records at a time from ARCHIVE_INFO to a save table.  It was run in parallel across 10 different connections with different date ranges to move data faster.  With the data moved out of the ARCHIVE_INFO table, we could see the external purge query against ARCHIVE_INFO complete very quickly.  Over the course of a week, external purge was able to reduce ARCHIVE_INFO down to around 7 million rows.  We trickle-fed rows from the save table back into the primary ARCHIVE_INFO table for external purge to start picking up.  We ended up not having to move any data out of WORKFLOW_CONTEXT.
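For illustration, the parallel runs looked roughly like the following, one statement per connection, each covering a disjoint START_TIME window (the dates are placeholders; the procedure itself is shown below as Stored Procedure 1):

-- connection 1
call AII_MOVE_ARCHIVE_INFO_TO_OFFLINE ('2017-01-01', '2017-02-01')
-- connection 2
call AII_MOVE_ARCHIVE_INFO_TO_OFFLINE ('2017-02-01', '2017-03-01')
-- ... connections 3 through 10, each with its own window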

 

This is not a fast process.  It took approximately 3 weeks to move data out, get the system stable, trickle-feed data back in, and get the system back to a stable, self-sustaining state, but this process required no outage or downtime.

 

 

QUERY 1:

SELECT wf_id, group_id, archive_flag, archive_date
FROM   archive_info
WHERE  wf_id IN (SELECT wf_id
                 FROM   (SELECT DISTINCT wf_id, Row_number()
                                           OVER (
                                             ORDER BY wf_id) AS RANK_ROW
                         FROM   archive_info
                         WHERE  archive_info.archive_date < ?
                                AND ( archive_info.archive_flag = 1
                                       OR archive_info.archive_flag = 2 )) AS
                        LEVEL1
                 WHERE  rank_row < ?)
       AND archive_info.archive_date < ?
       AND archive_info.archive_flag IN ( 1, 2 )
       AND archive_info.group_id IN ( 1, 2 )

 

 

 

Setup:

CREATE TABLE "ARCHIVE_INFO_INDEXER" (

        "WF_ID" BIGINT NOT NULL,

        "GROUP_ID" INTEGER NOT NULL,

        "ARCHIVE_FLAG" INTEGER,

        "ARCHIVE_DATE" TIMESTAMP,

        "WFC_START_DATE" TIMESTAMP

    )

    DATA CAPTURE NONE

    COMPRESS YES

 

CREATE INDEX "AII_ARCHIVE_INFO_PERF_IDX1"

    ON "ARCHIVE_INFO_INDEXER"

    ("ARCHIVE_FLAG" ASC, "GROUP_ID" ASC, "WF_ID" ASC)

    MINPCTUSED 0

    ALLOW REVERSE SCANS

    PAGE SPLIT SYMMETRIC

    COMPRESS YES

 

CREATE INDEX "AII_SCI_IDX_176"

    ON "ARCHIVE_INFO_INDEXER"

    ("WF_ID" ASC, "ARCHIVE_FLAG" ASC)

    MINPCTUSED 0

    ALLOW REVERSE SCANS

    PAGE SPLIT SYMMETRIC

    COMPRESS YES

 

CREATE INDEX "AII_SCI_IDX_177"

    ON "ARCHIVE_INFO_INDEXER"

    ("ARCHIVE_DATE" ASC, "ARCHIVE_FLAG" ASC, "GROUP_ID" ASC, "WF_ID" ASC)

    MINPCTUSED 0

    ALLOW REVERSE SCANS

    PAGE SPLIT SYMMETRIC

    COMPRESS YES

 

CREATE INDEX "AII_SCI_IDX_178"

    ON "ARCHIVE_INFO_INDEXER"

    ("ARCHIVE_FLAG"  ASC,  "WF_ID" ASC)

    MINPCTUSED 0

    ALLOW REVERSE SCANS

    PAGE SPLIT SYMMETRIC

    COMPRESS YES

 

 CREATE UNIQUE INDEX "AII_ARCHIVE_INFO_PK"

    ON "ARCHIVE_INFO_INDEXER"

    ("WF_ID" ASC, "GROUP_ID" ASC)

    MINPCTUSED 0

    ALLOW REVERSE SCANS

    PAGE SPLIT SYMMETRIC

    COMPRESS YES

 

ALTER TABLE "ARCHIVE_INFO_INDEXER" ADD CONSTRAINT "AAI_ARCHIVE_INFO_PK" PRIMARY KEY

    ("WF_ID", "GROUP_ID")

     

CREATE INDEX "AII_IDX_2"

    ON "ARCHIVE_INFO_INDEXER"

    ("WFC_START_DATE"  ASC)

    MINPCTUSED 0

    ALLOW REVERSE SCANS

    PAGE SPLIT SYMMETRIC

    COMPRESS YES

 

 

Stored Procedure 1:

create or replace procedure AII_MOVE_ARCHIVE_INFO_TO_OFFLINE (
    in pStart timestamp,
    in pEnd timestamp
    )
begin

    -- Copy up to 100k un-archived ARCHIVE_INFO rows into the save table,
    -- capturing the matching WORKFLOW_CONTEXT start time so the rows can
    -- be trickle-fed back in START_TIME order later.
    insert into ARCHIVE_INFO_INDEXER
    (WF_ID, GROUP_ID, ARCHIVE_FLAG, ARCHIVE_DATE, WFC_START_DATE
     )
    (select a.WF_ID, a.GROUP_ID, min(a.ARCHIVE_FLAG), min(a.ARCHIVE_DATE), max(w.START_TIME)
    from archive_info a
    inner join workflow_context w
    on a.wf_id = w.workflow_id
    where a.archive_date is null
    and w.start_time between pStart and pEnd
    group by a.WF_ID, a.GROUP_ID
    order by a.wf_id, a.group_id
    fetch first 100000 rows only)
    ;

    -- Remove the copied rows from the primary table.
    delete from archive_info where
     (wf_id, group_id) in
     (
         select a.wf_id, a.group_id from archive_info a
         inner join ARCHIVE_INFO_INDEXER AAI
         on a.wf_id = aai.wf_id --and a.group_id = AAI.group_id
         where aai.WFC_START_DATE between pStart and pEnd
    );
    commit;
end
#

# sample call
#call AII_MOVE_ARCHIVE_INFO_TO_OFFLINE ('2000-01-01', '2018-01-01')

 

Once the backlog was clear, data was moved back into the ARCHIVE_INFO table from the save table.  Only about 5 million rows were moved at a time, by calling the following stored procedure multiple times.  We monitored the count of un-indexed data in the ARCHIVE_INFO table and moved more data over when it got low enough.
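A minimal sketch of that monitoring query, assuming "un-indexed" means rows the indexer has not yet stamped (ARCHIVE_DATE IS NULL, the same condition Stored Procedure 1 selects on):

SELECT COUNT(*)
FROM   ARCHIVE_INFO
WHERE  ARCHIVE_DATE IS NULL
WITH UR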

 

Stored Procedure 2:

create or replace procedure MoveAAIDataBack ()
begin
        declare movedate timestamp;

        -- Find the START_TIME cutoff covering the oldest ~100k saved rows.
        select max(WFC_START_DATE) into movedate from (
        select * from ARCHIVE_INFO_INDEXER
        order by WFC_START_DATE
        fetch first 100000 rows only
        )
        for read only with ur;

        -- Move everything up to that cutoff back into the primary table,
        -- then remove it from the save table.
        insert into ARCHIVE_INFO (wf_id, group_id, archive_flag, archive_date)
        select wf_id, group_id, archive_flag, archive_date from archive_info_indexer
        where WFC_START_DATE <= movedate;

        delete from ARCHIVE_INFO_indexer where WFC_START_DATE <= movedate;
end
#
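As with Stored Procedure 1, a sample call; each invocation moves roughly one 100k batch, so a 5 million row pass meant repeating it about 50 times:

# sample call, repeated until ARCHIVE_INFO_INDEXER is empty
#call MoveAAIDataBack ()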

 


#IBMSterlingB2BIntegratorandIBMSterlingFileGatewayDevelopers
#DataExchange