I think I would start with the first option (calculations in data module only) and check the performance. Of course various factors may change that point of view (noumber of rows/cloumns or complexity of further modelling/reporting).
Original Message:
Sent: Wed July 12, 2023 09:10 AM
From: Gustavo Andrade
Subject: Get the difference of the variables between 2 tables
In your view, what would be the most appropriate way to get the data for "Customers_Not_Special"?
By SQL Join between CLI_ESP and CLI_TOTAL or by calculating each column (Revenue, etc.)
------------------------------
Gustavo Andrade
Data Analyst
Original Message:
Sent: Wed July 12, 2023 06:17 AM
From: Philipp Hornung
Subject: Get the difference of the variables between 2 tables
Hi Gustavo,
In relational modelling you have several options:
- Virtual only: A data module with a union of both tables (a join would probably do the same job, personally I prefer union)
- No redundant persistence of data and no read time lag
- Data set: A data set with the union (or join) of both tables. On top of the data set I recommend a data module for flexibility and possible enhancement reasons.
- Probably best reading performance
- Table: Additional table filled by the ETL process like you suggested it.
- Good performance and transparency (ETL process for all tables).
Best regards,
Philipp
------------------------------
Philipp Hornung
Business Intelligence Manager
Techniker Krankenkasse
Hamburg Germany
#IBMChampion
Original Message:
Sent: Tue July 11, 2023 04:42 PM
From: Gustavo Andrade
Subject: Get the difference of the variables between 2 tables
Hello,
I have 2 tables
1 with 3k of records called CLIENT_ESPECIAL
1 with 5M records called CLIENT_TOTAL (including CLIENT_ESPECIAL data).
Both are summarized by branch, containing only the amounts and sums of the main columns.
In my panel I always need to display the indicators by Special Customers and Non-Special Customers, in this case I understand that it would be a simple subtraction when displaying the indicator, for example in the Revenue column:
Customers_Not_Special.Revenue = Customers_Totals.Revenue - Customers_Special.Revenue
In this case, the best way to do it is by calculation when displaying the indicator in the Dashboard/Report, or better to create a new table with this difference "Clients_No_Speciais"? Or better to unify the 2 tables (with a TYPE column with value CLI_ESPECIAL or CLI_NOT_ESPECIAL) and use the result of the union to create the dashboard.
Below is an example of the current structure and how I imagine the ideal scenario
| TABLE_CLIENT_ESPECIAL |
| DATABASE | NUM_BRANCH | TYPE | REVENUE | QTY_CUSTOMER |
| 2023-04 | 10014 | CLI_ESPECIAL | 125256 | 25589 |
| 2023-04 | 10028 | CLI_ESPECIAL | 856898 | 38158 |
| | | | |
| TABLE_CLIENT_TOTAL |
| DATABASE | NUM_BRANCH | TYPE | REVENUE | QTY_CUSTOMER |
| 2023-05 | 10014 | CLI_TOTAL | 1284689788 | 1286589 |
| 2023-05 | 10035 | CLI_TOTAL | 2599567889 | 4544578 |
| | | | |
| EXPECTED_CLIENT_TABLE |
| DATABASE | NUM_BRANCH | TYPE | REVENUE | QTY_CUSTOMER |
| 2023-04 | 10014 | CLI_ESPECIAL | 125256 | 25589 |
| 2023-04 | 10028 | CLI_ESPECIAL | 856898 | 38158 |
| 2023-05 | 10014 | CLI_NOT_ESPECIAL | 1284689788 | 1286589 |
| 2023-05 | 10035 | CLI_NOT_ESPECIAL | 2599567889 | 4544578 |
#Cognos Analytics with Watson
------------------------------
Gustavo Andrade
Data Analyst
------------------------------