IBM Apptio

A place for Apptio product users to learn, connect, share and grow together.

  • 1.  Is there an impact to performance for duplicate relationships?

    Posted Thu August 13, 2020 01:38 PM

    In the example below we see an allocation from Cost Source to Labor.  Data Relationships include Temp Labor and Cost Center which are both in the Cost_Source_Labor Key.  Is performance impacted by including the Account Subgroup in Cost Center in defined relationships?  If so, how impactful?

     

     

    @Debbie Hagen

    @Guillermo Cuadrado

    @Lauren Pagan








    #CostingStandard(CT-Foundation)


  • 2.  Re: Is there an impact to performance for duplicate relationships?

    Posted Wed August 19, 2020 03:49 PM

    Tagging some people for Michelle for possible replies: @Andrew Mulvaney @Kate Lozer @Michael Darragh @Ashley Peterson @Adam Moretz @Fred Salatino @Chris Davidson


    #CostingStandard(CT-Foundation)


  • 3.  Re: Is there an impact to performance for duplicate relationships?

    Posted Thu August 20, 2020 02:39 AM

    I would add @Gulcin Menekse, our wonderful CSM and an expert on sparseness. Also, @Sheridan Coulter, who's helped us a few times on topics related to performance.


    #CostingStandard(CT-Foundation)


  • 4.  Re: Is there an impact to performance for duplicate relationships?

    Posted Wed August 19, 2020 04:47 PM

    Hi, Michelle. There are many factors that can impact performance, but there are some tricks to seeing the impact in real time. In your case, making your allocation parameters more granular probably makes the allocation more performant, because you are spreading fewer rows across a many-to-many allocation. Try this before and after your change:

    1. Open your Cost metric
    2. Switch to Diagram view through the drop-down at the top
    3. Zoom in on your Cost Source to Labor allocation at the bottom of the model
    4. Click on Cost Source, Labor, and the line between them, noting the row count on each one

     

    When row counts between objects go up, performance takes a hit. If you have 3K rows in your Cost Source going to 20K rows in Labor, the worst case for the allocation line between them is 3K x 20K = 60M rows, which is what a pure even spread would produce. That would significantly slow things down. So the more granular you make your allocation strategy, the fewer source rows will spread across fewer destination rows.
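    To make the arithmetic concrete, here is a rough Python sketch of how the row count on the allocation line depends on the matching key. It only models the counting logic, not how Apptio actually computes allocations; the function name, the 100 cost centers, and the key values are made up for illustration.

```python
from collections import Counter

def allocation_row_count(source_keys, dest_keys):
    """Count source/destination row pairs that share the same key value."""
    dest_counts = Counter(dest_keys)
    return sum(dest_counts[key] for key in source_keys)

# Hypothetical data: 3,000 Cost Source rows and 20,000 Labor rows.

# Pure even spread: every source row matches every destination row.
source_even = ["ALL"] * 3_000
dest_even = ["ALL"] * 20_000
worst_case = allocation_row_count(source_even, dest_even)

# More granular strategy: rows only match within the same (made-up) cost center.
source_keyed = [f"CC{i % 100}" for i in range(3_000)]
dest_keyed = [f"CC{i % 100}" for i in range(20_000)]
keyed = allocation_row_count(source_keyed, dest_keyed)

print(worst_case)  # 60000000
print(keyed)       # 600000
```

    With rows evenly divided over 100 cost centers, the line carries 100x fewer rows than the even spread. Real data will be less uniform, so treat this as an upper-level intuition rather than a prediction.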

     

    Try adding your Subgroup, and then repeat the steps above. Did your allocation line row count go up or down? How much it changes will relate to how much performance is impacted. Let us know!


    #CostingStandard(CT-Foundation)


  • 5.  Re: Is there an impact to performance for duplicate relationships?

    Posted Wed August 19, 2020 06:27 PM

    The row counts will be the same since we are simply defining the same relationship twice. So it won't change the row counts, but I would think the query itself would be more complex with the duplicates because they are already defined once. It would process the query 'where {Cost Source Master Data.Cost Center}={Labor Master Data.Cost Center}' and then process the same thing again as

    'where {Cost Source Master Data.Cost Source Labor_Key}={Labor Master Data.Cost Source Labor_Key}.'
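    A small Python sketch of that reasoning, under the assumption that the key is built from Cost Center and Temp Labor: matching on the key alone and matching on the key plus Cost Center give the same pairs, but the second version checks an extra condition for every candidate pair. The row layout and helper names are hypothetical, and this says nothing about how the Apptio engine actually evaluates relationships.

```python
def make_row(row_id, cost_center, temp_labor):
    """Build a row whose key already contains the cost center (assumed layout)."""
    return {
        "id": row_id,
        "cost_center": cost_center,
        "temp_labor": temp_labor,
        "key": f"{cost_center}|{temp_labor}",
    }

def match_pairs(source_rows, dest_rows, columns):
    """Return (source id, destination id) pairs that agree on every listed column."""
    return [
        (s["id"], d["id"])
        for s in source_rows
        for d in dest_rows
        if all(s[c] == d[c] for c in columns)
    ]

source = [make_row(1, "CC1", "Y"), make_row(2, "CC2", "N")]
dest = [make_row(10, "CC1", "Y"), make_row(11, "CC1", "N"), make_row(12, "CC2", "N")]

key_only = match_pairs(source, dest, ["key"])
with_duplicate = match_pairs(source, dest, ["key", "cost_center"])

print(key_only == with_duplicate)  # True: same pairs, one extra comparison per pair
```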


    #CostingStandard(CT-Foundation)


  • 6.  Re: Is there an impact to performance for duplicate relationships?
    Best Answer

    Posted Thu August 20, 2020 07:50 AM

    Fred (and I) initially assumed you were interested in adding an additional field to your existing key column definition. As he noted, this normally reduces the row count of the assignment ratio table (the table produced by combining all relevant source and destination table rows).

     

    But it sounds like you want to add an additional table column to your existing data relationship, for example going from 6 columns (seen in your screenshot, minus Account Subgroup) to 7 columns (including Account Subgroup)?

     

    If so, you are correct: This will add computational complexity to the allocation, although probably adding only a few seconds at the most of additional calculation time (per project time period), owing to how your data relationship is already constructed.

     

    Also, you can roughly gauge the time difference yourself. Make a small arbitrary change to the data table (such as adding a new column) and use Fred's idea of viewing the Cost model diagram: time how long it takes the allocation (Cost Source to Labor) to display its value. This is your baseline time. Now make the Account Subgroup change you proposed, and refresh the Cost model diagram view. Again, time how long it takes the allocation results to display on the diagram. Multiply the time difference by the number of time periods open in your project for an estimate of the calculation impact.
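    As a quick worked example of that estimate, here is a minimal Python sketch. The timings and the number of open periods are invented placeholders; substitute your own stopwatch measurements.

```python
def estimated_impact_seconds(baseline_seconds, changed_seconds, open_periods):
    """Per-period time difference multiplied by the number of open time periods."""
    return (changed_seconds - baseline_seconds) * open_periods

# Hypothetical measurements: 12.0 s before the change, 14.5 s after, 24 open periods.
impact = estimated_impact_seconds(12.0, 14.5, 24)
print(impact)  # 60.0 seconds of extra calculation time across the project
```

    A single timing can be noisy, so repeating each measurement a few times and averaging should give a steadier estimate.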


    #CostingStandard(CT-Foundation)


  • 7.  Re: Is there an impact to performance for duplicate relationships?

    Posted Thu August 20, 2020 05:37 PM

    Thank you @Chris Davidson.  I'll post my results here when I finish!


    #CostingStandard(CT-Foundation)