Planning Analytics

  • 1.  TM1 Rollback/Retry for locking contention

    Posted Fri April 28, 2023 08:36 AM

    Hi, I have run into several lock contention situations and am trying to understand when TM1 does a rollback and retry. I have observed that sometimes it restarts/invokes the thread automatically roughly every 60 seconds, even though the original blocked thread is still waiting. This results in two or more copies of the same thread running at the same time. In other lock contention cases, however, it waits until the competing thread finishes and then executes normally.
    1. Under which circumstances will TM1 do a rollback and retry? I guess it depends on the lock contention type, such as IXC, IXCur or WR?
    2. Can we force TM1 not to retry when there is lock contention, and instead just wait, or stop and throw an error?
    Thanks.



    ------------------------------
    mvp morgan
    ------------------------------


  • 2.  RE: TM1 Rollback/Retry for locking contention

    Posted Mon May 01, 2023 04:22 AM

    As you know, locking is part of normal operations; it is part of the two-phase locking protocol TM1 uses for its concurrency control. This mechanism guarantees threads concurrent or exclusive access to objects in the system if and when they need them. However, when multiple threads need elevated access to an object, a thread might end up waiting on another thread that already holds a lock on that object that is 'incompatible' with the lock it is requesting. In such a case it ends up in one of those states (IXC, IXCur or WR) you are referring to above, until the holder of the initial lock finishes. All business as usual.
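
    To make the "incompatible lock means wait" idea concrete, here is a minimal sketch of a lock table. It is not TM1's implementation: it uses only two generic modes, shared ("S") and exclusive ("X"), rather than TM1's actual lock modes, and the names LockTable, request and release_all are hypothetical.

```python
from collections import defaultdict

# Generic compatibility matrix (an assumption, NOT TM1's real lock modes):
# two shared locks can coexist; anything involving an exclusive lock cannot.
COMPATIBLE = {
    ("S", "S"): True,
    ("S", "X"): False,
    ("X", "S"): False,
    ("X", "X"): False,
}


class LockTable:
    """Hypothetical lock table: tracks which thread holds which mode on which object."""

    def __init__(self):
        self.held = defaultdict(list)  # object name -> list of (thread, mode)

    def request(self, thread, obj, mode):
        """Grant the lock and return 'granted', or return 'wait' if it conflicts."""
        for holder, held_mode in self.held[obj]:
            if holder != thread and not COMPATIBLE[(held_mode, mode)]:
                return "wait"
        self.held[obj].append((thread, mode))
        return "granted"

    def release_all(self, thread):
        """Release every lock the thread holds (done once, when it finishes)."""
        for obj in self.held:
            self.held[obj] = [(t, m) for t, m in self.held[obj] if t != thread]


table = LockTable()
print(table.request("T1", "dimension", "S"))  # granted
print(table.request("T2", "dimension", "X"))  # wait
table.release_all("T1")
print(table.request("T2", "dimension", "X"))  # granted
```

    The waiting thread is not in error; it simply proceeds once the holder has finished and released its locks.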

    Now to your questions:
    1. Rollbacks [only] happen when the [locking] system detects that threads end up waiting on each other, where the lock a thread is waiting on is held by another thread that has itself ended up in a waiting state, a so-called dead-lock situation. The dead-lock detection then rolls back enough threads to make sure that at least one of those threads, otherwise blocked indefinitely, gets the chance to continue again.
    2. Other than cancelling an operation, which might not be immediate either, you cannot.
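
    As an illustration of answer 1, dead-lock detection is commonly modelled as finding cycles in a "wait-for" graph and rolling back one thread per cycle until none remain. The sketch below shows that general technique only; how TM1 detects dead-locks and picks which threads to roll back is not described in this thread. The function names and the victim rule (the highest thread name in the cycle) are hypothetical.

```python
from typing import Dict, List, Optional, Set


def find_cycle(waits_for: Dict[str, Set[str]]) -> Optional[List[str]]:
    """Return one cycle of threads in the wait-for graph, or None if there is none."""
    visited: Set[str] = set()

    def dfs(node: str, path: List[str], on_path: Set[str]) -> Optional[List[str]]:
        visited.add(node)
        path.append(node)
        on_path.add(node)
        for blocker in sorted(waits_for.get(node, ())):
            if blocker in on_path:
                # We came back to a thread already on the current path: a cycle.
                return path[path.index(blocker):]
            if blocker not in visited:
                cycle = dfs(blocker, path, on_path)
                if cycle:
                    return cycle
        path.pop()
        on_path.remove(node)
        return None

    for start in sorted(waits_for):
        if start not in visited:
            cycle = dfs(start, [], set())
            if cycle:
                return cycle
    return None


def resolve_deadlocks(waits_for: Dict[str, Set[str]]) -> List[str]:
    """Pick threads to roll back until the wait-for graph has no cycle left."""
    graph = {thread: set(blockers) for thread, blockers in waits_for.items()}
    victims: List[str] = []
    while (cycle := find_cycle(graph)) is not None:
        victim = max(cycle)  # hypothetical victim rule, chosen only for the demo
        victims.append(victim)
        graph.pop(victim, None)  # the victim no longer waits on anything
        for blockers in graph.values():
            blockers.discard(victim)  # its locks are released by the rollback
    return victims


# T1 and T2 wait on each other (a dead-lock); T3 merely waits on T1.
print(resolve_deadlocks({"T1": {"T2"}, "T2": {"T1"}, "T3": {"T1"}}))  # ['T2']
```

    Note that T3 is only waiting, not dead-locked, so it is never a rollback candidate; that matches the distinction between ordinary waiting and a dead-lock made above.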

    So just to be clear: you'd like to be able to instruct the system that if an operation ends up being rolled back, it should simply not be retried and should return with an error? What kind of operations are you thinking of? Just any? Are you thinking of TI processes and/or [REST] API level operations?



    ------------------------------
    Hubert Heijkers
    STSM, Program Director TM1 Functional Database Technology and OData Evangelist
    ------------------------------



  • 3.  RE: TM1 Rollback/Retry for locking contention

    Posted Tue May 02, 2023 06:22 AM

    Hi, Hubert.
    Thanks for the context. I get your point on this:
    "The dead-lock detection then rolls back enough threads to make sure at least one of those threads, otherwise blocked indefinitely, get the chance to continue again."
    However, from what I tested and observed for this dead-lock case:
    1. The first thread, in Wait:IXCur state, is blocked on a dimension on which another thread holds a READ lock. This is common.
    2. A second, newly restarted copy of the same thread is in Wait:IXC state and is actually blocked by #1. This makes sense, as #1 is requesting an IX lock on the dimension.
    3. A third, newly restarted copy of the same thread is also in Wait:IXC state and is blocked by #1 in the same way.
    It seems that if the dead-lock situation persists long enough, TM1 will start a new thread repeatedly, roughly every 60 seconds. Once the dead-lock is eventually resolved, all these restarted threads are executed one after another, which causes situations we don't want to see, e.g. redundant data.

    I totally agree we should resolve the dead-lock in the first place. Still, it seems to me it would be more convenient to let TM1 throw an error when it detects this kind of dead-lock and stop there.
    I was originally thinking of a system configuration parameter, as this concerns TM1's overall behaviour, and dead-locks usually cannot easily be discovered by developers beforehand unless some group of TIs is run in parallel in some particular order. Thanks.



    ------------------------------
    mvp morgan
    ------------------------------