Cognos Analytics

  • 1.  Calculated Columns - wow this is painful.

    Posted Thu May 18, 2023 02:19 PM

    Hello,

    I want to do a straightforward expression, but I am finding the CEL syntax impossible to penetrate, as there don't seem to be many examples of how to learn the expression editor syntax. It's all just typical engineering manuals: here is a spoon, it is spoon-shaped. There are no contextual examples of how to use functions in, say, the context of cereal.

    Anyway, I want to sort my table called joined_data by the column 'date', which contains a date for each row, into reverse order, i.e. the last date recorded is the first row. Let's call the new calculated column Daily_Close_to_Close_Returns. I then want to subtract the next row's value in the 'adj_close' column from the first row's value in 'adj_close', divide the result by the previous day's 'adj_close' number, and format the output as a % (so multiply by 100?). Then I want this value inserted into the corresponding row, which is the first row.

    In Excel this formula is trivial: ((F2-F3)/F3)*100 (yes, the data was already sorted in reverse date order).

    #Frustrated

    This experience feels very much like Qlik, i.e. you can't really get anything done without being a developer, i.e. an IBM developer that costs $$$.


    ------------------------------
    Lindsay Rex
    ------------------------------


  • 2.  RE: Calculated Columns - wow this is painful.

    IBM Champion
    Posted Thu May 18, 2023 04:18 PM

    It's hard to help here because I'm not sure which tool you're trying to accomplish this in. Since you mention tables, I'm assuming it's a relational source. Is this a Data Module or a Framework Manager Package?

    The second problem I'm running into is how you want to handle dates with missing values. Based on what you said we could get this result, but it may not be correct. Notice the jump from Jul 16 to Jul 13; is this output acceptable?



    This would be easier to accomplish on a dimensional or DMR model. In a purely relational source we need to make a two-step query. There are a few functions you can use to get the data from previous rows. In this case I think we would want moving-total([Measure],2) - [Measure]. Moving-total sums the last N rows, including the current one, so subtracting the current measure gives you the previous row's value. Then it's a simple matter of ([Measure] - [Prev Row]) / [Prev Row]. Don't bother multiplying by 100, because we can use data formatting for that.
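
    As a rough sketch in terms of the original post's columns (the data item names [Prev Close] and [adj_close] are just assumptions here, and it relies on the query being presorted by date as described below), the two calculated data items might look something like:

    Prev Close:  moving-total([adj_close], 2) - [adj_close]
    Daily_Close_to_Close_Returns:  ([adj_close] - [Prev Close]) / [Prev Close]

    As above, leave out the * 100 and set the data format of Daily_Close_to_Close_Returns to percent instead.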

    In order for this to work correctly, the Step 1 query should have presort set on the date field:

    In the query explorer drag in a new query and drag Step 1 to the right of it.

    Drag in the fields you need, then use Step 2 as a source for your list. Then set the sorting on Date to descending and you should have what you need:


    Don't forget to set the format on the Growth column to percent:


    Regarding your comment about needing developers, it's definitely true that training on Cognos will help a lot. There are a ton of consulting firms, including the one I work for, that would be happy to help with both your training and development needs.



    ------------------------------
    Paul Mendelson
    ------------------------------



  • 3.  RE: Calculated Columns - wow this is painful.

    Posted Thu May 18, 2023 08:21 PM

    Hi Paul,

    You zeroed in on my exact lack of expertise, namely understanding the differences between dimensional and relational data structures. After an educational discussion with ChatGPT I arrived at this summary of the two:
    ---------------

    Denormalizing data into a dimensional model involves duplicating data, which can increase storage requirements. However, the denormalization process is intended to optimize data retrieval, analysis, and reporting, rather than focusing on write-heavy transactional operations. By duplicating data, dimensional models simplify queries and enable intuitive exploration based on dimensions and hierarchies, leading to improved query performance and facilitating ad-hoc reporting.

    On the other hand, relational data models prioritize data normalization to reduce redundancy and maintain data integrity. This normalization process eliminates data duplication and ensures efficient write operations by avoiding the need to update multiple instances of duplicated data. Relational databases excel in read-and-write or transactional use cases where maintaining data consistency and integrity during write operations is crucial.

    Therefore, the choice between dimensional and relational data models depends on the specific requirements of the use case. Dimensional models are optimal for read-heavy workloads, analysis, reporting, and decision support systems. Relational models are suitable for transactional processing where maintaining data consistency and minimizing write-time overhead are primary concerns.

    --------------------
     
    So from your notes, and this deeper understanding, I can deduce that I probably don't need a Postgres relational database to store my data in, as my use case is specifically 'analysis' and 'read-only' rather than 'transactional'. I have also learned that my typical data analysis perspective and knowledge have been rooted in a dimensional data structure for the past 20 years, which is why I was finding it painful to work with a relational model, as this was my first time architecting the SQL database. My actual goal was to stop using Excel for the analysis; the data is over the 10,000-row mark, so Excel struggles every time I do a new what-if update to the analysis.

    So what I think I need to do is create a dimensional data structure at the ETL stage and just read the data into the BI world, pre-calculated, or 'denormalized'.

    Thank you for replying Paul.  

    Kind regards,

    Lindsay Rex

    FYI, I am using Postgres in a container on my local PC, and connecting 'Cognos Analytics on Cloud' to the db using a 'data server connection'.


    ------------------------------
    Lindsay Rex
    ------------------------------



  • 4.  RE: Calculated Columns - wow this is painful.

    IBM Champion
    Posted Fri May 19, 2023 09:20 AM

    ChatGPT has a really nice description there, and it's absolutely correct. A dimensional model repeats data, but it greatly simplifies the act of building complicated queries. Instead of using Data Modules, you may want to take the time to build a DMR (Dimensionally Modelled Relational) package in Framework Manager. 

    Consider this expression:
    tuple([Revenue],parallelPeriod([Sales (analysis)].[Time].[Time].[Year],1,currentMember([Sales (analysis)].[Time].[Time])))
    It's finding the intersection of [Revenue] and the member 1 year back from whatever the current contextual time period is. 
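
    As a rough sketch of how that could be used (the data item names [Prev Year Revenue] and [Growth] are illustrative, not from a real model), the growth calculation might be built as:

    Prev Year Revenue:  tuple([Revenue],parallelPeriod([Sales (analysis)].[Time].[Time].[Year],1,currentMember([Sales (analysis)].[Time].[Time])))
    Growth:  ([Revenue] - [Prev Year Revenue]) / [Prev Year Revenue]

    As before, format [Growth] as a percentage rather than multiplying by 100.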

    So a year level crosstab might look like this:

    But since it's a dimensional source, we can expand one of the years:

    And if you only care about growth, you don't need to see the other measures; you could put something else in the columns:

    Notice 2010 disappeared there; since there was no 2009 data, growth is null.

    At this point growth can be considered a measure, and used as such. Take a look at this:

    Growth is now the default measure in the crosstab; Regions and Products are in the rows and columns, both sorted on growth descending. The user can select any year, quarter, or month they want, and growth will always be measured against the same period in the previous year.

    While you're looking at making changes to the ETL, I also recommend going over the Kimball methodologies. The Data Warehouse Toolkit may be old, but it is the gold standard when it comes to building a data warehouse. (Suck it Inmon)

    Cognos works best with denormalized dimension and fact tables designed as a star schema. It will also handle (if properly modelled) multigrain facts (sales data at a daily grain, but targets at a monthly grain), many-to-many facts, and non-conformed dimensions.

    I strongly, strongly recommend finding a partner to do either training or the initial data modelling, or both. A well-built data model makes life infinitely easier. The Growth calculation above could just as easily have been defined in the model. A poorly built model can be incredibly slow and may even return incorrect values.



    ------------------------------
    Paul Mendelson
    ------------------------------