WebSphere Application Server & Liberty

  • 1.  Node Agent Shutting down Intermittently every day

    Posted Tue March 05, 2024 12:10 AM
    Edited by Chaitanya Kumar R Tue March 05, 2024 09:01 AM

    Hi Everyone,

    We have a clustered environment with 2 node agents and 2 servers. The node agents go down every day at the same time. We didn't see anything in the server logs, but the following messages are logged in the node agent logs on Windows OS.

    DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: Member Cell01\Webnode01\RSrv connection  was closed. Member will  be removed from view. DCS connection status is Discovery|Ptp, transmitter closed. Removing closed connection.
    TransportAdap W   DCSV1117W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: The stream from Member Cell01\Webnode01\RSrv has closed. The channnel is View|Ptp. 
    DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: Member Cell01\Webnode01\MXUI-1 connection  was closed. Member will  be removed from view. DCS connection status is Discovery|Ptp, transmitter closed. Removing closed connection.

    TransportAdap W   DCSV1117W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: The stream from Member Cell01\Webnode01\MXUI-1 has closed. The channnel is View|Ptp. 
    DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: Member Cell01\Webnode01\nodeagent connection  was closed. Member will  be removed from view. DCS connection status is Discovery|Ptp, transmitter closed. Removing closed connection.
    TransportAdap W   DCSV1117W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: The stream from Member Cell01\Webnode01\nodeagent has closed. The channnel is Connected|Ptp. 
    VSyncAlgo2    I   DCSV2004I: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: View synchronization completed successfully. The View Identifier is (9:0.Cell01\CellManager01\dmgr). The internal details are The leader equals Cell01\CellManager01\dmgr, ccv equals ((9:0.Cell01\CellManager01\dmgr).3.0).
     CoordinatorIm I   HMGR0228I: The Coordinator is not an Active Coordinator for core group DefaultCoreGroup. The active coordinator set is [Cell01\CellManager01\dmgr].

    Please help me fix this error.

    Thank you in advance for your response.



    ------------------------------
    CKK
    ------------------------------



  • 2.  RE: Node Agent Shutting down Intermittently every day

    Posted Wed March 06, 2024 01:54 AM

    This is normally related to core groups and the HA manager. You may also need to look into the buffer size of the core group.

     - You should see warnings from the node agent JVM reporting, for example, that it can't talk to the DMGR. Check the node agent's SystemErr.log too for info. I would look for high CPU usage errors or, oddly enough, OOM/heap dumps, as in a Systhrow message.

    The message normally relates to core group issues. Core group failure detection looks for connections that closed because the underlying socket was closed. When a failed member is detected through this socket-closing mechanism, the following message is logged in the SystemOut.log file of the surviving members:

    DCSV1115W: DCS Stack DefaultCoreGroup at Member anzioCell01\anzio\ServerD:
    Member anzioCell01\anzio\ServerC connection  was closed. Member will  be removed from view.
    DCS connection status is Discovery|Ptp, transmitter closed.
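
    To see which members the node agent is losing, the DCSV1115W lines can be tallied per remote member. The following is only a rough sketch: the regular expression is derived from the message text quoted in this thread, and the function name `closed_connections` and the sample list are made up for illustration.

```python
import re
from collections import Counter

# Pattern derived from the DCSV1115W text quoted in this thread (an assumption
# about the general message shape, not an official format definition).
DCSV1115W = re.compile(
    r"DCSV1115W: DCS Stack (?P<core_group>\S+) at Member (?P<local>\S+): "
    r"Member (?P<remote>\S+) connection\s+was closed"
)

def closed_connections(lines):
    """Count DCSV1115W events per (local member, remote member) pair."""
    counts = Counter()
    for line in lines:
        match = DCSV1115W.search(line)
        if match:
            counts[(match["local"], match["remote"])] += 1
    return counts

# Sample lines copied from the original post (truncated after "was closed.").
sample_lines = [
    r"DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: Member Cell01\Webnode01\RSrv connection  was closed.",
    r"DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: Member Cell01\Webnode01\MXUI-1 connection  was closed.",
    r"TransportAdap W   DCSV1117W: DCS Stack DefaultCoreGroup at Member Cell01\Webnode02\nodeagent: The stream from Member Cell01\Webnode01\RSrv has closed.",
]

for (local, remote), n in closed_connections(sample_lines).items():
    print(f"{local} lost {remote}: {n} time(s)")
```

    If every closed connection points at members on the same node, that node (or the network path to it) is the first place to look.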



    ------------------------------
    Joe Molina
    ------------------------------



  • 3.  RE: Node Agent Shutting down Intermittently every day

    Posted Wed March 06, 2024 01:58 AM

    Got trigger happy... here is some info:

    DCSV1115W : DCS Stack DefaultCoreGroup at Member <this member>: Member <remote member> connection was closed. Member will be removed from  view. DCS connection status is Discovery|Ptp , transmitter closed. 
    These are all indications that a socket closed. They can be expected when a process is administratively stopped or killed.
    These messages indicate that the remote process that closed the socket will be removed from the view. If these messages occur
    but the remote process was not stopped or killed, it is an indication of some other problem.
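
    Since the shutdowns are reported at the same time every day, it may help to bucket the DCSV1115W events by hour and compare the result with scheduled jobs (backups, antivirus scans, restarts). This is a sketch under assumptions: the excerpts in this thread have their timestamps stripped, so the bracketed prefix `[M/D/YY H:MM:SS:mmm TZ]` and the sample times below are hypothetical, and `closures_by_hour` is a made-up helper name.

```python
import re
from collections import Counter

# Assumed line prefix: [M/D/YY H:MM:SS:mmm TZ]. Adjust to your log's real layout.
STAMP = re.compile(r"^\[\d{1,2}/\d{1,2}/\d{2} (?P<hour>\d{1,2}):\d{2}:\d{2}:\d{3} \S+\]")

def closures_by_hour(lines):
    """Count DCSV1115W lines per hour of day (0-23)."""
    per_hour = Counter()
    for line in lines:
        if "DCSV1115W" not in line:
            continue
        match = STAMP.match(line)
        if match:
            per_hour[int(match["hour"])] += 1
    return per_hour

# Hypothetical lines: the timestamps are invented for illustration only.
hypothetical_lines = [
    "[3/4/24 0:10:12:345 IST] 0000001a DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup ...",
    "[3/5/24 0:10:09:120 IST] 0000001a DiscoveryTx   W   DCSV1115W: DCS Stack DefaultCoreGroup ...",
    "[3/5/24 14:02:41:007 IST] 0000002c VSyncAlgo2    I   DCSV2004I: DCS Stack DefaultCoreGroup ...",
]

for hour, n in sorted(closures_by_hour(hypothetical_lines).items()):
    print(f"{hour:02d}:00  {n} closed connection(s)")
```

    A single dominant hour would suggest something external is stopping or killing the processes on a schedule, rather than a core group tuning problem.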


    ------------------------------
    Joe Molina
    ------------------------------



  • 4.  RE: Node Agent Shutting down Intermittently every day

    Posted Wed March 06, 2024 05:33 AM

    Hello Joe,

    Thank you for the response.

    As you suggested, I have gone through the SystemErr log files of the node agent and the server, but I couldn't find any errors there.

    I am not sure what to do next. Kindly advise.

    Thank you 



    ------------------------------
    C K K
    ------------------------------