DataPower

  • 1.  sPGA Memory on DB

    Posted 26 days ago

    Hello Community,

    We are connecting to Oracle Database using the SQL DataSource.
    We have the timeouts set at the optimum levels.

    Still, the database team reports that sPGA memory held by INACTIVE sessions keeps growing and is not being released automatically from the DataPower end.

    If we disable and re-enable the SQL DataSource (i.e., reset the connections), the DB team confirms that their PGA usage has improved.

    What could be causing this?
    We are not yet sure whether the issue is on the DataPower side or the database side.

    Timeouts on DP:
    Connection timeout: 10 seconds
    Query timeout: 15 seconds
    Idle connection timeout: 8 seconds



    ------------------------------
    Thanks
    ------------------------------


  • 2.  RE: sPGA Memory on DB

    Posted 26 days ago

    This is very interesting as my client recently had a very similar problem.  It wasn't with connections to a DB, but with an MPGW not completely hanging up connections with its back end.  When we reset the service (disable, enable), they cleared immediately.

    Is this happening regularly?  For us, for now, it has happened once, and we believe it might be related to a change on a firewall, but until it happens again, we won't know for certain.

    If you don't mind me asking, what firmware version are you using?

    And, I'd suggest going ahead and opening a case with IBM. 



    ------------------------------
    Joseph Morgan
    CEO - Independent
    Joseph Morgan
    Dallas TX
    ------------------------------



  • 3.  RE: sPGA Memory on DB

    Posted 26 days ago
    Edited by Sunil Chaurasia 26 days ago

    Hi Joseph,

    Thanks for answering.

    It is happening regularly, and the PGA memory stays occupied.
    The DB team says there are multiple INACTIVE sessions consuming 500-700 MB.

    By firewall, I assume you meant the network firewalls.
    How could a firewall cause the inactive sessions, and how can we detect this from our side?

    We are currently on the latest X3 machine.

    You mentioned the MPGW not hanging up its connections. Can you please elaborate:
    1) Were there issues with the persistent connection configuration?

    2) Was there anything to note about the backend timeout value?

    Please share how you identified the issue and how you fixed it.
    :-)



    ------------------------------
    Thanks
    ------------------------------



  • 4.  RE: sPGA Memory on DB

    Posted 25 days ago

    Hi Sunil,

    Yes, I was talking about network firewalls.  We had seen a change ticket related to network firewalls for the evening prior to the onset of our problem.   Some Cisco firewalls have a default setting to not hang up connections.  We still don't know if this was the cause.  

    It could have been the actual application servers, as they were overloaded at the time as well.   What we do know is that only one service was affected, because we have considerable metrics, and no other service out of well over 100 showed any problem, even those handling 3 or 4 times the amount of traffic.  This is strong evidence that the problem was initiated by the back end servers, but we still don't understand why the connections were not released hours after the service timeout.  It was not until we restarted the service that the connections cleared.

    These appliances are virtual, so a bit of a difference.

    The error report showed a number of errors related to port exhaustion.   The TCP Summary showed a very large number in FIN_WAIT_2. 

    I'm convinced of two things:  

    1. The problem was created by the back end servers being overloaded.   We were sending 6,000 transactions a second, which is well over the normal 1,000 or so.
    2. Something went wrong somewhere in DataPower to NOT hang up the connections after timeout.   It seems to have become "stuck" at FIN_WAIT.  I cannot tell you the difference between FIN_WAIT_1 and FIN_WAIT_2.  

    So, you might want to monitor TCP Summary and see if any of those metrics grow.
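
    One way to watch for growth is to capture the TCP Summary at intervals and compare snapshots. The following is a minimal sketch, assuming the summary has been saved as plain text with one "state count" pair per line. The function names, the growth threshold, and the sample numbers are hypothetical and are not part of DataPower.

```python
def parse_tcp_summary(text):
    """Parse lines like 'Time-wait 8556' into {'time-wait': 8556}.

    Assumption: a state listed without a number is treated as 0.
    """
    counts = {}
    for line in text.splitlines():
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        state = parts[0].lower()
        has_count = len(parts) > 1 and parts[1].isdigit()
        counts[state] = int(parts[1]) if has_count else 0
    return counts


def growing_states(previous, current, threshold=100):
    """Return states whose count rose by more than `threshold` between snapshots."""
    growth = {}
    for state, value in current.items():
        delta = value - previous.get(state, 0)
        if delta > threshold:
            growth[state] = delta
    return growth


# Illustrative snapshots only; these numbers are made up.
before = parse_tcp_summary("Established 140\nFin-wait-2 12\nTime-wait 8556")
after = parse_tcp_summary("Established 150\nFin-wait-2 900\nTime-wait 8600")
print(growing_states(before, after))  # {'fin-wait-2': 888}
```

    A fixed threshold is only a starting point; what counts as abnormal growth depends on the normal traffic pattern of each appliance.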

    Check your logs for port exhaustion.

    I'd also look at logs for an increase in backside connection failures and timeouts.  It is interesting to note we did not see an increase in timeouts, whereas we did see a large increase in connection failures.   My guess is once it reached port exhaustion, it could no longer make a new connection.

    I also don't understand why, when it reached port exhaustion, other services were not affected.  Maybe Hermann or someone can add insights to that.



    ------------------------------
    Joseph Morgan
    CEO - Independent
    Joseph Morgan
    Dallas TX
    ------------------------------



  • 5.  RE: sPGA Memory on DB

    Posted 24 days ago
    Edited by Sunil Chaurasia 22 days ago

    Hi Joseph,
    Thanks for answering.

    I would like to know how we can filter the logs for port exhaustion, i.e., whether there are any specific event codes.
    The TCP Summary values change constantly, so here they are as of now:

    Established 140 
    Syn-sent
    Syn-received
    Fin-wait-1
    Fin-wait-2
    Time-wait 8556 
    Closed
    Close-wait
    Last-ack
    Listen 27 
    Closing

    Regarding timeouts: we are not seeing an abnormal number of connection failures or timeouts from the backend/DB source either.

    Correct, we have other SQL DataSource objects, but so far we are getting sPGA concerns from only one DB.

    @Hermanni Pernaa  @Hermann Stamm-Wilbrandt  @Steve Linn  Requesting your suggestions on this thread.



    ------------------------------
    Thanks
    ------------------------------



  • 6.  RE: sPGA Memory on DB

    Posted 21 days ago

    It is really a sum of those you're looking for, but keep an eye on these numbers as you want to figure out the "normal" patterns.   We're pushing that stat to our log collection tool so we can monitor it and examine historical patterns.

    If you are getting sPGA concerns from only one DB, it is beginning to sound like a comparison of DB settings might be in order.

    The event codes we are monitoring are 0x80e003c1, 0x80400a4, 0x01a40008, and 0x804002c.
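
    For filtering exported logs by these codes, a minimal sketch follows. It assumes the logs are available as plain-text lines that contain the event code somewhere in each line; the function name and the sample lines are hypothetical and do not reflect the actual DataPower log format. The codes are copied exactly as listed above.

```python
# Event codes copied as listed in this thread.
EVENT_CODES = ("0x80e003c1", "0x80400a4", "0x01a40008", "0x804002c")


def matching_lines(lines, codes=EVENT_CODES):
    """Yield log lines that mention any of the given event codes.

    Matching is a case-insensitive substring test, so a code also
    matches when it appears as a prefix of a longer hex value.
    """
    lowered = tuple(code.lower() for code in codes)
    for line in lines:
        text = line.lower()
        if any(code in text for code in lowered):
            yield line.rstrip("\n")


# Made-up sample lines, only to show the filter in use.
sample = [
    "12:00:01 [0x80e003c1] example message A",
    "12:00:02 [0x00000000] example message B",
    "12:00:03 [0X01A40008] example message C",
]
for hit in matching_lines(sample):
    print(hit)
```

    The same function works on an open file object, since it only iterates over lines.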



    ------------------------------
    Joseph Morgan
    CEO - Independent
    Joseph Morgan
    Dallas TX
    ------------------------------