Cognos Analytics with Watson

 View Only

Synchronizing data source metadata changes to update data module metadata

By IAN HENDERSON posted Fri March 11, 2022 09:46 AM

  

Updating metadata properties

Starting in Cognos Analytics (CA ) 11.2.2, you can more easily update the metadata of your modules to match what is stored in the loaded schemas, uploaded files, and data sets which your modules use.

The refresh metadata functionality determines which property values of the data sources used by the selected objects have been changed and attempts to update the data module to pick up those changes.

The metadata changes which will update are the data type, technical data type, aggregate, and usage properties.

These are properties of query items and stand-alone calculations.

By making a selection, you can update the query items of the objects which are encompassed by the selection by choosing the refresh properties menu item on the context menu. You can also select multiple objects and update the metadata. The objects which will allow you to select and refresh properties are the module, calculations, query items, query subjects, views or similar objects, and folders. Folders within a query subject will not be valid for this action.

The existing method, manually editing the expressions for each affected query item, is too lengthy and kludgy.

The refresh metadata will also refresh the metadata of objects which are used by the objects of the selection.

The new update metadata functionality works similarly to the Framework Manager (FM) update object functionality.

Terminology

In this paper, the term data source refers to either a data base schema, an uploaded file, or a data set. A data set is a single table data base stored by Cognos. An uploaded file will generate something similar, with a table for each worksheet in the XLS-type file or, if the file is a CSV, a single table. In addition, calculation refers to a stand-alone calculation. Query item is a column from a table in the data source.

Background

When either a schema is loaded, a file is uploaded, or a data set is created, metadata describing the columns and tables of the data source is generated. This metadata includes things such as the name of the object and, in the case of a column in a data source, the data type of the column. In addition, Cognos attempts to determine the usage of the column and its aggregation.

This metadata is stored and is used by Cognos in its query generation. This can help avoid unnecessary metadata callbacks to the data source.

When something from a data source is added to a data module, this metadata is added to the data module. You can use the metadata and alter it in the data module.

Sometimes it is necessary to change properties in the data source. Some people have made metadata changes to their data sources and want to be able to propagate those changes to their data modules.

There is an existing method to update the metadata of a module. This method would require that for each query item or calculation which used a column in your data base whose metadata had changed, or which used an expression which referenced such a query item or calculation, you open the expression editor, enter a space, usually at the end of the expression, which would force the expression to change, and then save the changed expression. The action would call the data source, get the metadata for the column, and modify the metadata as stored in the module.

The refresh properties functionality makes the business of doing the updating faster.


How refreshing metadata works

The update functionality will examine the objects which were selected for updating. It will identify query items which exist in their expressions and attempt to identify if the metadata properties of those query items or the query items which are used in the query item expression, have changed. If so, then the update will update the properties.

As part of the examination, the update will walk back through the path of query items which are used by the selected object to get to the query items which exist as part of the data source. Thus, not just the query items which have been selected, but also the query items that are used by the selected query items, will be updated.

For example, assume you have a view. In it is a query item. The expression of the query item is a reference to a query item in another query subject. In this picture, the query item Employee Name in employee name - View is a reference to a query item in employee name, which in turn, is a reference to Employee Name in the query subject Emp Employee Dim.

Chain of query subjects illustrating how the selection will determine what metadata is updated

If you selected employee name - View and chose refresh properties, the update functionality will determine that the expression of the object refers to employee name. It will then try to determine if employee name refers to another query item in the module or if it refers to a column in a data source.

If the former, then it will continue to walk down to the query item referenced by the query item. If the latter, it will compare the metadata as stored in the module to that which is stored in the data source. In our case employee name refers to the query subject Emp Employee Dim.

If the metadata is different, then the property values of the data source object will be used to update the query item’s properties. Those changes will be propagated to each query item. In this case, the metadata of Emp Employee Dim will be changed as well as that of employee name - View and employee name.

The selection affects the scope of the refresh of the metadata.

Assume that employee name was selected only. Refresh properties would update employee name and Emp Employee Dim. It would not update employee name – View.

If Emp Employee Dim was selected, the metadata for Emp Employee Dim would be updated. The metadata for employee name - View and employee name would not.

There are tradeoffs which need to be considered for each type of selection. A discussion of that is in the tradeoffs and considerations section.

You can undo the changes. When the update is complete an undo event is generated. If you press the undo button, then the metadata update will be reversed.

You can cancel a long running update task by pressing the cancel button. No changes will be applied to your module.

If the update process does not identify anything which needs updating it will display a message saying that the refresh was successful, however no undo item will be generated, as nothing was changed. This is a known defect.

Metadata which is stored in any of the 4 properties, or which are used to determine a property value will be updated during the refresh.

Workflow

If you have modules which use other modules, the technique to update the metadata is to start at the lowest level modules, update them, and work your way up through the modules. This is because objects from other modules exist as references and are read-only.

Illustration of a data module referenced within another data module

Tradeoffs and considerations

There are two approaches to performing the metadata refresh. They are module refresh and the selected refresh. In the former, you would select the data module root node and perform refresh properties. In the latter, you would select individual query subjects, query items, and calculations, or some set of them, and perform refresh properties.

The former will save time and will update everything in the module which has metadata different from their data source metadata.

You could accidentally update the metadata which you might want to preserve. It is possible that you want to preserve the usage or aggregation property as set by you in the data module. In some cases, this will be preserved during the metadata update. In others it won’t.

The latter approach of refresh on a selection of objects will be more precise. Although requiring more UI actions by the modeler, it is still faster than the old metadata refresh technique. It also requires that the modeler have a clear understanding where his metadata is being used, which arguably should be the case anyway.

Because you would need to know where a query item is used, and to select and refresh each item which uses the query item, this is where something like the FM object dependency feature would be helpful but it’s not on the list of FM feature gaps so don’t hold your breath.

As mentioned in the section How refreshing metadata works, if you have several objects which use the same source object, and you choose to use the selected refresh method, then you need to select and refresh each of the objects.


Column refresh

Metadata update does not update the list of columns used in a query subject. If a column has been added to the source table, then the new column needs to be dragged into the module. It will be added to the query subject at the bottom of the list of query items. You can move it to another position in the list of query items.

In addition, you can drag the entire table into the module to add the new columns to the query subject.

If a column has been removed from the data source the validation of the data module should identify any query item which refers to it as having missing references. This would require the modeler to update the expressions to refer to objects which exist.

Why two data type properties

Cognos stores two data type properties. One is technical data type, which describes the data type in a mapping to the data type of the data source object. It is located in the advanced properties section. The second is data type, which is a high-level classification of the data type, such as string, numeric, date or datetime. In most cases, the high-level data type is only the one which you need to be aware. An exception would be cases such as where you need to be aware of the size of the column, such as with text columns.

Here is an example. In this query item, the technical data type stores information such as the precision and scale.

This is an example of the mapping between the metadata properties of a column in a data base and those of the data type and technical data type properties in the data module.

This column has a data type of decimal, numeric precision of 19, and numeric scale of 2.

Illustration of a data module referenced within another data module



Here is the presentation of that in the data module. The data type is decimal. The technical data type includes the precision and scale information.

The metadata of a column as displayed in a data module
0 comments
36 views

Permalink