Cognos Analytics

Cognos Analytics

Connect, learn, and share with thousands of IBM Cognos Analytics users! 

 View Only

Replicating Framework Manager Star Schema Groupings within Data Modules Using Shortcuts

By Marc Reed posted Thu July 06, 2023 03:09 AM

  

Star Schema groupings are available in Framework Manager. A star schema grouping consists of shortcuts for a fact table and the associated dimensions. Star schema groupings are used to overcome several data modelling issues, including:

  1. Resolving ambiguous joins between dimensions.
    Consider a fact less query that only contains dimensions. Through which fact table should the dimensions be joined?
    A star schema grouping resolves this issue. The report author chooses the dimensions from the star schema that they wish to use join the dimensions together. The report author explicitly determines which of the multiple join paths they wish to use. Without a star schema grouping, the alphabetically first fact table is used when there are
    multiple join paths.

  2. Users can easily see what dimensions relate to which facts.
    How does a report author know which dimensions can be used to analyse a particular fact table? A star schema grouping puts the fact table and dimensions into a single folder. As these are shortcuts to the underlying facts and dimension objects, Cognos knows that these
    are the same dimensions if they are used in a multi-fact query.

Let’s look at how these issues relate to a Data Module. 

I have used the Great outdoors data module as a starting point for this discussion. Out of the box this data module looks like this:

A single Sales fact table and its associated dimensions. Being a single fact table this cant be used as is for this demonstration as we need multiple fact tables.

In this Data Module the Returns table is joined to the Sales table on Order Number. This join does not take into account the Product being returned.

This Returns and Sales join results in some errors when this data module is used by report authors:

The Return Quantity is double counted for every line in the Order Number. This data module should really be two fact tables - one for Sales and another for Returns. 

I have taken this Data Module and made it look like this:

An aside: In no way should this be thought of a complete and correct remodel of this Data Module. This is just a useful set of facts and dimensions to demonstrate the purpose of this blog post - multiple fact tables with different dimension granularity. The above remodel does not allow us to analyse returns by all of the available dimensions.  A simpler solution to remodel and give correct results would be to change the cardinality of the join between Sales and Returns with a 1:M between Sales and Returns and use Column Dependency to avoid double counting. In a returns query Sales would then act as a snow flake dimension. Back to the purpose of the post! 

Returns is now treated as a fact table. The data module now has two fact tables – Sales and Returns. The direct join between the two fact tables has been removed.

The existing product dimension now joins to both fact tables.

A new derived dimension Order Number has been created. This is a list of Order Numbers. This has been created by using a view on the Sales fact table.

It is important there is only a single row for each order number within the derived dimension to avoid double counting when joining to a fact table. To get a single row for an Order Number a column dependency has been created.

The data grid shows how there is only a single row for each order number.

If I recreate the report from the start of this exercise (the one with the repeating Return Quantity) against the updated data module, the report shows:

I can see the report is now working correctly. The Return Quantity is against the correct Product, and I have avoided any double counting. Cognos is performing a multi-fact query to join these two fact tables together on the common dimensions.

But by resolving the data issues I have created some issues around multiple facts. When an author uses the Data Module, how do they know which dimensions can be used to analyse the Sales or Returns data?

I know that Returns can only be analysed by Order Number and Products, but how would a typical consumer derive that from the above tree view of the Data Module?

If the consumer creates a fact less query – such as Order Number and Product Number – what fact table is used to join those two dimensions. Is the question being asked “what products are returned against order numbers?” or “what products are sold in order numbers?”. The consumer has no way of influencing which fact table is used to join these two dimensions. 

If I create a fact less report, then I see this.

I can see this must be Returned products by matching the previous examples, but what if I wanted sold products instead?

When there is an ambiguous join path between dimensions Cognos will use the alphabetically first fact table. In this example the Returns table is used. Within Framework Manager I would resolve these issues with a Star Schema Grouping. But Star Schema Groupings are not available within Data Modules – what can I do instead? 

I can get a very similar behaviour within Data Modules using shortcuts.

A Folder For Each Fact

The first step is to create a folder for each fact table. This should be a 1:1, a folder should only have one fact table within it.

A Dimensions Folder

Create a folder for all the dimensions and place all the dimensions within it.

As this is less useful for the consumer this is placed at the bottom of the Data Module. These are still useful as they could be used for building prompts.

Shortcuts For Dimensions

For the dimensions that are appropriate to each fact, create a shortcut and place the shortcuts within the appropriate folder.

For example, for the Returns fact, the appropriate dimensions are Order Number and Products. I have created shortcuts for these two dimensions, using the table option:

And then placed these shortcuts within the appropriate folder:

The Returns folder now contains the Returns fact table, and the shortcuts to the Order Number and Product dimensions.

Joins to the Shortcuts

Within the relationships diagram:

The Returns fact is joined to the Product and Order Number dimensions, and not the shortcuts. I need to remove the joins to the dimensions and instead join to the shortcuts.

Joins to shortcuts are created just like any other join.

Repeat For All Facts

Repeat these steps for each fact. Within this example, the steps are repeated for the Sales fact table.

The diagram shows that the facts are only joined to the appropriate dimension shortcuts. The original dimensions are no longer joined to any facts.

Dimensions used by multiple fact tables have multiple shortcuts. Two Product shortcuts and two Order Number shortcuts can be seen in the above diagram.

Consumer Experience

When the Data Module is used the consumer now gets a much better experience. For example, within Reporting the data module looks like:

The consumer can now easily see what dimensions can be used to analyse the facts. 

By creating an Order Number and Product Number report using the items from the Returns folder, the dimensions are joined by the Returns fact. The same is true using items from the Sales folder.

The consumer uses the appropriate folder for the question they want to ask.

In the above output, compare the row for Order Number 100003. Note the different products. By using the data items from the appropriate folder, a different answer is shown.

If the consumer wants to answer the questions “what Products were returned in each order number” then they would use the data items from the Returns folder. If the consumer wants to answer the question “what Products were sold in each order” then they would use the data items from the Sales folder.

So far everything we have seen could have easily been replicated by just creating copies of the dimensions and not shortcuts. But by using shortcuts Cognos knows that the Order and Product in a multi-fact query are the same underlying dimension. When a multi-fact query is created, then these dimensions are used to stitch the answers together.

If Sales Quantity and Returns Quantity are added to the list on the right, then the correct multi-fact query is created:

This would also work if these measures are added to the Returns list (on the left).

If simple copies of the dimensions were used, then the above report would look very different and would not be correct.

A Note About Identifiers

Within a data module each table must have its own unique identifier. In this example shown there are now three tables for Product – the original table and two shortcuts. All three must have their own unique identifier. (Within Framework Manager all three Product objects could have the same identifier as a Framework Manager Namespace is used to allow repetition of Identifiers.)

Give some thought to the identifiers. One convention is to use the role as a suffix to the identifiers, for example:

  1. PRODUCT
    As the identifier for the original Product table.
  2. PRODUCT_RETURNS
    As the identifier for the shortcut in the Returns folder
  3. PRODUCT_SALES
    As the identifier for the shortcut in the Sales folder.

Navigation Paths and Shortcuts

Within CA11, shortcuts do not inherit navigation paths of the underlying objects. If you created a navigation path on Products, then you would need to create similar navigation paths on the Product shortcuts too.

2 comments
36 views

Permalink

Comments

Fri July 07, 2023 10:06 AM

I haven't spotted any issues with renaming shortcuts.

The question is should you create a shortcut or alias for you different roles of the date dimension.

I think IBM sees the key difference between the two being the joins they have between the facts - a shortcut will inherit the existing joins of the underlying table, an alias should have different joins.

The tricky part is what do you want to happen in a multi-fact query. You want the Order Date and Delivery date to not be seen as the same dimensions.

So you definitely need two distinct dimensions for Order date and delivery date.

In my technique I create the joins between the facts and the shortcuts, and it is this what gives the user the control of the fact less query.

If I was given this problem I think I would have two underlying date tables, one for order date and one for delivery date..Then create shortcuts to those two tables.

The question is - how do you create the two underlying objects.

  1. copies of the same calendar table.
  2. one original, order date, then a view of that as delivery date.
  3. I would avoid using a shortcut/alias of Order date to make delivery date.

If I was modelling I would have two underlying tables.

Then I could call the columns in one things like Delivery Year, Delivery Month and the columns in the other things like Order Year, Order Month.

This makes it obvious to the user that when there is a date on the report what it actually is.

Fri July 07, 2023 09:46 AM

Thank you! Your article will help us with our current project.

Do you also have recommendations in case a dimension is present in more than one role, e.g. an order date and a delivery date? As far as I remember the name of a shortcut cannot be edited.