Cognos Analytics

  • 1.  JVM Startup Tips/Tricks (10.1.1)

    Posted Fri September 18, 2020 08:40 AM

    All,

    We just recently added two new distributed nodes to our Cognos 10.1.1 (yes, I know) infrastructure.

    We have a geographically distributed installation (different datacenters for each server) with two hardware-based Solaris 10 machines (1 processor, 16 cores, 8gb ram, 100mb/s lan, slow disk) and just added two new Solaris 11 virtual machines (1 processor, 8 cores, 24gb ram, 1gb/s lan, fast disk).

    We have the application tier and content manager tier split, but resident on the same server listening on different ports.

    Normally on the older machines, it takes ~15 minutes to start up the JVM (maximum memory 768 MB) and another 10 minutes to start the Cognos services after the JVM is up, for the primary content manager, with a similar timeframe for all the application tier dispatchers (1158 MB JVM).

    On the new machines, our application tier components start up in about 3 minutes with the same 1158 mb (much much faster).

    However, when attempting to fail over the content manager from the older machines to the newer machines, it took much much longer -- we stopped the entire platform when we hit 45 minutes of what essentially amounted to inactivity.  Based on tuning recommendations, we had increased the memory allocation on the backup content manager to 2gb prior to this failover.

    The current primary content manager co-exists with the content store in the same DC and thus has very little latency, but the other DCs have < 100 ms database latency as well, despite being geographically distributed.


    Can anyone provide any tips/settings (java startup params) to get the JVM built faster, so that the JVM build doesn't take 20 minutes?  The startup times seem to be excessive, and the downtime associated with a content manager failover is prohibitive.



    ------------------------------
    Dax Lawless
    ------------------------------

    #CognosAnalyticswithWatson


  • 2.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Tue September 22, 2020 09:24 AM

    I don't think the Java engine is your issue; more likely it is the connection between the standby CM in DC2 and the CM database, which still resides in DC1.
    Trying to find network bandwidth and latency architecture statements for CA came up short, but for the old BI I found this:

    https://www.ibm.com/support/knowledgecenter/SSEP7J_10.2.2/com.ibm.swg.ba.cognos.crn_arch.10.2.2.doc/c_adgplnnfrastrct.html
    "IBM Cognos BI server components should be connected by a network with 100 Mb of available capacity"

    If replicating the CM database to your DC2 is not in scope, I would suggest creating a support call with IBM, labelling it "information request", and asking them whether the statement for BI still stands for CA.
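
    To put a rough number on the link between a CM host and the content store database, one option is to time TCP connects, since a connect costs about one network round trip. This is only a sketch: the function name `tcp_connect_ms` is made up for illustration, and the host and port you pass in are whatever your database listener uses. It measures network round trips only, not query time.

```python
import socket
import time


def tcp_connect_ms(host, port, samples=5, timeout=5.0):
    """Return the TCP connect time in milliseconds for each of `samples` attempts."""
    timings = []
    for _ in range(samples):
        start = time.perf_counter()
        # Open and immediately close a connection; only the handshake is timed.
        with socket.create_connection((host, port), timeout=timeout):
            pass
        timings.append((time.perf_counter() - start) * 1000)
    return timings
```

    Running this from each datacenter against the database listener (for example `tcp_connect_ms("db-host.example", 1521)`, both values hypothetical) gives per-site numbers to compare.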



    ------------------------------
    STEFAN VERMEULEN
    ------------------------------



  • 3.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Mon September 28, 2020 04:50 AM
    Hello @STEFAN VERMEULEN, @Dax Lawless

    CognosConfig:
    * disabled features we don't need (like mobileService)  --> saves RAM and startup time
    * removed languages we don't need --> saves RAM and startup time
    ... see github for our script cleaning up Cognos installations (cleanUpCognosInstallation)

    Do you use a standby CM or do you start the new DC2 when DC1 fails?
    --> With an already running standby CM, we are able to switch over within seconds when the primary CM fails. See StandBy CM in the documentation. Our setup is just like yours - we also run servers in DC1 + DC2 in different locations.
    --> Cognos services are started automatically after the server(s) is rebooted. The primary Content Manager sometimes starts as standby. One of the secondary Content Managers starts as active. (see Details)

    One of our Cognos DEV servers runs 12 installations. So, we are running out of RAM and DISK-SPACE on the host.
    We found an article about shared classes on disk and in memory ... for WebSphere Liberty (WLP). See details on sharedClasses. This could be a good solution for us. Has anyone used sharedClasses among different installations of Cognos? It might also help you to speed up the startup process.

    Do you have any clue what WLP is doing "all those 45 minutes"? ntrace/strace the java process? Enable trace logs in WLP? Debug the Java process to understand what is happening? Here is a link on how to detect hangs or loops in Oracle Java

    hth

    ------------------------------
    Ralf Roeber
    ------------------------------



  • 4.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Wed September 30, 2020 05:41 AM
    We investigated the WLP shareClasses options - none of the options described are respected by WLP.

    There is an active issue with Open Liberty describing that files in wlp/usr/servers/cognosserver/workarea are duplicated every time the server is started.

    We posted a question to the support forum

    Any suggestions on how to control the OSGi storage or simply disable the cache?

    ------------------------------
    Ralf Roeber
    ------------------------------



  • 5.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Wed September 30, 2020 08:49 AM
    This question is in regard to Tomcat and Cognos 10.1.1, so it's not WLP that is having the issue.  I put the below parameters into the bootstrap xml file for my Cognos 11.0.13 UAT installation, and it did seem to noticeably speed up the startup times:

    <param condName="${java_vendor}" condValue="IBM">-DcmcacheFetchSize=1000</param>
    <param>-DdefaultRowPrefetch=1000</param>

    What I do not know is whether this will impact Tomcat inside Cognos 10.1.1 under JDK 1.6.

    ------------------------------
    Dax Lawless
    ------------------------------



  • 6.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Wed September 30, 2020 08:52 PM
    10.1.1 was prior to IBM JRE, so that xml snippet won't be valid for it.

    Looking at your original post, I would question your original game plan.  The issue is that your old servers take 25 minutes to start up; have you been through the logs looking at the timestamps of the startup process?  For me that would be nothing short of a "rip it out and start again" scenario.  I assume you're doing something like this: the old servers are out of HW support, and the new servers are a stopgap until you actually upgrade to v11.  Question: why are you trying to keep the old servers around?  Just schedule an outage, do a full shutdown, start up the new machines, and test failover between those two.

    You're not going to be able to get much in the way of help for such an old version.

    ------------------------------
    Nick McCoy
    ------------------------------



  • 7.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Wed September 30, 2020 09:06 PM
    Nick,

    That was the ultimate goal, but as it stands right now, our failover testing between the old and new nodes hasn't been great, despite the new hardware being much more powerful than the old, so this question is partially for the new nodes.

    We are looking to get some sort of detail as to what is going on under the hood so that we can tweak what we can while we have to keep the system alive.

    ------------------------------
    Dax Lawless
    ------------------------------



  • 8.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Wed September 30, 2020 09:26 PM
    Why would you test failover of old > new then?  Wouldn't you just test failover of the new primary CM to the new standby CM?  Personally, I would set up the new 10.1.1 servers as a parallel environment and, depending on your gateway setup, have a changeover outage, with an export/import of content into a fresh DB, as any DB-level backup/copy is likely to bring issues with it.  I've done a lot of fixes on older environments and, quite frankly, it's a bit of a time sink.

    Other things I would do:
    Check the CMCAPACITY table in the content store as well, and see if the new CMs are registered there the same as the old.
    Check the cmplst.txt files to make sure they're identical (there could be an IF you're missing).
    Increase the logging level, go through the logs, and pinpoint the delay in failover.
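
    For the log step, one way to find where the time goes is to scan a log for the largest gaps between consecutive timestamps. This is only a sketch and assumes each log line starts with a `YYYY-MM-DD HH:MM:SS` timestamp, which may not match your Cognos logs; adjust `TS_PATTERN` and `TS_FORMAT` as needed. The function name `largest_gaps` and the sample lines are made up for illustration.

```python
import re
from datetime import datetime

# Assumed timestamp layout at the start of each log line; adjust to your logs.
TS_PATTERN = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")
TS_FORMAT = "%Y-%m-%d %H:%M:%S"


def largest_gaps(lines, top=5):
    """Return up to `top` (gap_seconds, line_before, line_after) tuples, largest gap first."""
    stamped = []
    for line in lines:
        match = TS_PATTERN.match(line)
        if match:  # lines without a leading timestamp (e.g. stack traces) are skipped
            when = datetime.strptime(match.group(1), TS_FORMAT)
            stamped.append((when, line.rstrip("\n")))
    gaps = [
        ((later[0] - earlier[0]).total_seconds(), earlier[1], later[1])
        for earlier, later in zip(stamped, stamped[1:])
    ]
    gaps.sort(key=lambda gap: gap[0], reverse=True)
    return gaps[:top]


# Hypothetical sample lines, not real Cognos log output.
sample = [
    "2020-09-30 08:00:00 hypothetical: JVM started",
    "2020-09-30 08:00:05 hypothetical: connecting to content store",
    "2020-09-30 08:20:05 hypothetical: content store connection ready",
]
for seconds, before, after in largest_gaps(sample, top=1):
    print(f"{seconds:.0f}s gap\n  before: {before}\n  after:  {after}")
```

    To run it against a real file, pass an open file handle instead of `sample`; the lines on either side of the biggest gap show which step was slow.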





    ------------------------------
    Nick McCoy
    ------------------------------



  • 9.  RE: JVM Startup Tips/Tricks (10.1.1)

    Posted Fri October 02, 2020 12:33 PM
    Your main issue is the <100 ms database latency. You must bring that down by at least an order of magnitude. Content Manager is quite chatty with the Content Store, as everything flows through it. You should mirror your CS DB around the globe as well.
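
    A back-of-the-envelope way to see why latency matters this much: if startup issues its Content Store queries one after another, the time spent waiting is roughly round trips times latency. A small sketch of that arithmetic follows; the round-trip count of 20,000 is purely hypothetical and not a measured Cognos figure.

```python
def sequential_wait_minutes(round_trips, latency_ms):
    """Minutes spent waiting on the network if every query is issued one after another."""
    return round_trips * latency_ms / 1000 / 60


ROUND_TRIPS = 20_000  # hypothetical number of sequential Content Store queries at startup

for latency_ms in (1, 10, 100):
    minutes = sequential_wait_minutes(ROUND_TRIPS, latency_ms)
    print(f"{latency_ms:>3} ms latency -> {minutes:.1f} minutes waiting")
```

    Under that assumption, each tenfold cut in latency cuts the waiting time tenfold, whatever the JVM settings are.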

    ------------------------------
    Dariusz Danielewski
    ------------------------------