Platform

Platform

A place for Apptio product users to learn, connect, share and grow together.

 View Only

Pivoting from Manually loaded data to Automated feeds

By Brad Giles posted Fri March 18, 2022 02:27 PM

  

Overview


This post provides guidance on how to minimize "pivot pain" during deliveries or integration of new data into Apptio (TBM Studio).

Pivot pain
 is defined as the technical debt accumulated between manual ETL (done by the delivery team or TBMA) and fully automated data loads.

Standard Procedure

1)  Initial Upload of data into TBM Studio

  • Load initial table manually – done by Apptio Consultant
  • Load using same format as will be loaded programmatically. Consider the following:
    • File format (e.g. TSV)
    • Time format (How are dates represented in the data)
      • For Ad Hoc and Monthly frequency, load as such. For example, don't load a file with a year's worth of data with the months as separate columns, since this is NOT how the data will be loaded regularly. Load each month separately going back as far as you need to. This may seem like a pain, but it's better than versioning or having to pivot from a yearly table to loading monthly down the road.
      • For yearly (e.g. Budget) or quarterly (e.g. Forecast), also load as such. In this case, the months are probably called out as separate columns in the header.
  • Team vets data and starts to configure (normalize)
  • Once table formatting static (columns static, headers consistent, row data well known (e.g. numeric vs. label)), we target tables for pivoting. Shoot for 60-70% completeness of necessary columns/fields then automate.
    • Load these files in small batches by the type of data (e.g. Cost Source data, Fixed Asset data, etc.)
    • Usually we start with Cost Source related tables such as the Chart of Accounts and General Ledger
    • Typically we load the next months’ worth of data following the initial (manually) uploaded data

Note: If you let this go longer than 30 days or so from kickoff, you are probably looking at some serious pivot pain.

2) Programmatic Upload to TEST table(s)

  • For target tables, load using same name with “ TEST” tacked on to the end (into the same category/node)
  • Data should be loaded using the same format as was loaded manually
  • Load only one months’ worth of data and check in each table. This will load the data as the “Initial Upload” and will allow others to manually validate the data in the next step
  • Additional data will be loaded into the target table (not the TEST tables)

3) Manual validation of TEST table data

  • Manually compare the target tables and the TEST tables
  • Only minor differences should exist:
    • Column headers should match
    • No additional/missing columns
    • No additional rows at the top of the table (otherwise Apptio won’t know which row contains the headers)
    • Same data type in columns (e.g. numeric vs. label)
  • Correct any issues and reload data into the TEST tables. If necessary, document what corrections will need to be made to the configuration of the target tables.
  • Repeat until no differences exist (or the changes that require reconfiguration are well-known)

4) Pivot

  • Programmatically load data into the tables manually loaded in Step 1.
    • You do not (necessarily) need to re-load data already loaded, just enough to test that the automated loads work. For example, if May data was loaded, and June is ready to load – load the June data programmatically.
    • If the initial upload (data loaded at the beginning of time) is in a different format, this may produce unexpected results.
      • This often manifests itself within TBM Studio as a "Could Not Load Table From Disk" (CNLTFD) error in the destination table for the month where the data format changed.
      • In some cases, this may work (e.g. XLS to CSV), but often it will not. If this occurs, reload the initial data set at the beginning of time (the Initial Upload) and re-check. In some cases, tables must be recreated so that the initial upload data types matches the regular (e.g. monthly) load format… e.g. JSON (initial upload) --> JSON (additional months). This could require re-creation of synthetic columns and re-mapping/appending of tables to master or pre-master data sets.
      • Workarounds Include:
        • Versioning the second month
        • Appending new data load into the table with the data originally loaded (initial upload)
        • See the following Community Article for more details.

5) Pivot corrections and validation

  • Make any configuration changes due to differences in the programmatic data loads (these should be minor, if any), and check these in
  • If there are only minor differences in the data that result in no/minor re-configuration, accept the changes and check the table(s) back in
  • Validate that additional data uploads do not result in any errors for the month(s) loaded
  • Continue to load the data based on the refresh frequency, until you are caught up to the current month
    • For monthly data (e.g. GL), load for each month from the start of the project
    • For Ad Hoc data, load the data at the beginning of time (or for whichever months you want the data to change)

6) Schedule future data loads based on upload frequency defined for each table


#Datalink
0 comments
3 views

Permalink