Reports, dashboards, and stories need data. This data might be made available to you by an administrator who creates packages, or you may have uploaded your own data from an Excel file.
Cognos Analytics version 11.0.4 introduces a new type of gateway to data: the data set.
Data sets are created from packages or data modules. You can use a data set to gather a customized collection of items that you use frequently. When you update a data set, the dashboards and stories that use it pick up the changes the next time you open them.
You define a data set by choosing one or more items (columns) from a package or data module and applying filters to reduce the data. You’re essentially specifying the rectangle of columns and rows of data that you need. The data is then extracted and stored within the Cognos Analytics system.
Because the data is cached, data sets can improve query performance and reduce the workload on your database(s). Here are some reasons to use a data set:
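As a rough illustration of that rectangle, the following Python sketch treats a source as a list of rows, keeps only the chosen columns, and applies a filter. It is not how Cognos Analytics extracts data; the function name `define_data_set`, the column names, and the sample rows are all hypothetical.

```python
# Hypothetical rows standing in for a package or data module.
source_rows = [
    {"region": "Americas", "product": "Tent", "year": 2016, "units_sold": 5},
    {"region": "Europe", "product": "Lantern", "year": 2015, "units_sold": 3},
    {"region": "Europe", "product": "Tent", "year": 2016, "units_sold": 4},
]


def define_data_set(rows, columns, row_filter):
    """Keep only the rows that pass `row_filter`, and only the chosen `columns`."""
    return [
        {column: row[column] for column in columns}
        for row in rows
        if row_filter(row)
    ]


# Choose two columns and filter to a single year (an illustrative filter).
data_set = define_data_set(
    source_rows,
    columns=["region", "units_sold"],
    row_filter=lambda row: row["year"] == 2016,
)

for row in data_set:
    print(row)
```

The result is a smaller copy of the source: fewer columns because of the column selection, fewer rows because of the filter.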
• improve query performance if your database is slow
• reduce the load on an overworked database (especially during peak periods)
• retain a version of the data at a specific time
For data sets created from relational packages or data modules, you have the option "Summarize detailed values, suppressing duplicates".
When you use this option, measure values are aggregated to the lowest grain that is explicitly included in the data set. For example, suppose your data warehouse stores millions of records, one for each transaction where units were sold, but you are only interested in analyzing the total sales per region. If your data set contains only the Region and Units Sold columns and you use the option, the data set will contain only as many rows as there are regions. The screen captures for this option show the same effect: the values in the Quantity column are much larger when the option is enabled, and the data set has far fewer rows, because the quantities are rolled up into each distinct combination of retailer and order method type where units were sold.
The benefit of using this option is that it can condense the data set into fewer rows, and all else being equal, fewer rows lead to better-performing reports and dashboards!
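The roll-up described above can be sketched in a few lines of Python. This is only an illustration of aggregating a measure to the grain of the included columns, not the product's implementation; the `summarize` helper, the column names, and the sample transactions are hypothetical.

```python
from collections import defaultdict

# Hypothetical transaction-level rows: one row per sale.
transactions = [
    {"region": "Americas", "units_sold": 5},
    {"region": "Americas", "units_sold": 7},
    {"region": "Europe", "units_sold": 3},
    {"region": "Europe", "units_sold": 4},
    {"region": "Asia Pacific", "units_sold": 10},
]


def summarize(rows, group_columns, measure):
    """Sum `measure` for each distinct combination of `group_columns`."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[column] for column in group_columns)
        totals[key] += row[measure]
    return [
        {**dict(zip(group_columns, key)), measure: total}
        for key, total in totals.items()
    ]


summary = summarize(transactions, ["region"], "units_sold")

# Five detail rows collapse into one row per region.
for row in summary:
    print(row)
```

With only the region column included, the five detail rows become three summary rows, and each measure value is the total for its region.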
Do not use the option if information in the details is important for your analysis.

Refreshing your Data Set
Through the Cognos Analytics portal, you can change a data set’s columns and filters anytime you want. You can also refresh its data on demand, or schedule refreshes to occur automatically: weekly, daily, hourly, or every X minutes.
The information within a data set is pre-calculated and pre-aggregated. What you see in the preview area when defining the data set is what gets stored on the system. If a package or data module truncates a column’s values and you create a data set with it, the truncated values (as opposed to the original values) will be extracted and stored. Transformations that take a long time can be completed overnight so they’re ready to use first thing in the morning.

Data Sets from Data Sets
Data sets can be sources for data modules, and since you can create a data set from a data module, you can (indirectly) create a data set from one or more other data sets!
Each data set enables you to further combine, summarize, and pre-calculate data that will answer your team’s questions. With this approach you can summarize summaries to whittle down trillions of records from your Hadoop system into information that’s better suited for ad hoc exploration.

Release Them Into the Wild
Cognos Analytics data sets are insulated from all other systems, including the underlying database, so your database administrator won’t have to worry about runaway queries when data sets are being consumed. The size of a data set is easily controlled with filters. Administrators can limit the size of any single data set and the total volume that any one user can occupy on the system. Administrators can also control who is permitted to create data sets; for example, you might start by enabling a small group of power users before expanding.
If you keep a data set small, you can rest assured that no matter what someone does with it in a dashboard, they will get snappy response times.

Moving Data Sets between Cognos Analytics Environments
Data sets can be transferred from one Cognos Analytics environment to another. If you want to deploy into production some data sets that you tested in a staging environment, create a deployment in the staging environment that includes the folder(s) containing the data sets, and then import the deployment into the production environment. Select the deployment option "Include report output versions"* if you want the extracted data included; otherwise only the metadata will be deployed.

Data Sets Replace the Snapshot Mode of Data Modules
Prior versions of Cognos Analytics offered a snapshot mode option within a data module that would extract all the data. This snapshot mode is no longer available, as it has been made obsolete by data sets. Data modules that were set to snapshot mode in a prior version will upgrade into “live” / “regular” data modules in 11.0.4 and higher.
Data sets have the following advantages over the now-obsolete snapshot mode:
• Data sets give you the option of extracting summarized or detailed values.
• Data sets store data as a single table, whereas data modules in snapshot mode stored a separate file for each table in the module. All else being equal, a query that does not require a join will be faster.
• A subset of a data module can be extracted into a data set.
• Data set refreshes can be scheduled.
*As of 11.0.5, use the option "Include uploaded data" instead.