Planning Analytics

Get AI-infused integrated business planning

View Only

Back to Blog List

Time to Stop Using Expand?

By Declan Rodger posted Sat September 11, 2021 07:38 AM

Recently, I have been working on a project where I need to process incredibly large volumes of data. In IBM Planning Analytics/TM1 we can leverage TurboIntegrator (TI) Processes which are incredibly fast at Extract, Transform and Load (ETL) tasks. But, no matter how fast a process is - the more data you throw at it, the longer it will take.
It doesn't take a genius to know that users want to see their results quickly and it's our job as developers to streamline these processes to make them as fast as possible and get the users analysing instead of waiting.

On this particular project, I applied my usual bag of tricks to make the processes run faster. The results were good, it processes about 20 times more data than their old application and ran in around a third of the time. But I had a gut feeling it could still be faster, so I started looking at the things I wouldn't usually consider. I went through the processes line by line and function by function. Swapping out every function in there with an alternative method and testing the speed impact. Most changes were negligible or actually made things worse. But I found a couple of surprising results that saw big improvements, things that I hadn't seen mentioned before when talking about processing time.
One of them was the Expand function.

What is The Expand Function?

The Expand Function in TI processes is used to convert Variable names enclosed in % symbols to the contents of the variable at run time.

So if I had a process that had the variable vsCountry; I could use the below code (on my metadata or data tabs) to export the Country variable for the data record that is being processed:

TextOutput ( cFileName, Expand ( %vsCountry% ) );

The Expand function also converts its entire contents into 1 concatenated string, this becomes particularly useful when trying to do something like exporting a large file that contains both String (text) and Numeric data variables. The TextOutput and AsciiOutput functions only accept Strings, so you can either convert all of your Numeric variables to String using a function like NumberToString or use Expand to do it for you... so the 2 examples below would do the same thing:

TextOutput ( cFileName, vsCountry, vsStore, vsProduct, NumberToString ( vnVolume), NumberToString ( vnRevenue), NumberToString ( vnCost ), NumberToString ( vnProfit ) );

TextOutput ( cFileName, Expand ( %vsCountry%, %vsStore%, %vsProduct%, %vnVolume%, %vnRevenue%, %vnCost%, %vnProfit% ) );

And likewise, the example in the Expand reference guide entry from IBM is showing how to use Expand on an ODBCOutput to insert a row of data into a database table. But this also has an alternative as shown below where you concatenate the string up manually:

ODBCOutPut( 'TransData', Expand( 'INSERT INTO SALES( MONTH, PRODUCT, SALES ) 
VALUES ( "%V0%", "%V1%",%V2% )' ) );

ODBCOutPut( 'TransData', 'INSERT INTO SALES( MONTH, PRODUCT, SALES ) 
VALUES ( "' | V0 | '", "' | V1 | '", | NumberToString ( V2 ) | ' )' );

You can start to see how the Expand function can often make the code easier to read and hence why it is used extensively in TI Processes.

The Problem

When doing my analysis, I went through a TI process that exported circa 1 billion rows and approximately 30 columns (with a mixture of String and Numeric data.)
Initially, I had used the Expand function to create my export string, so I tried a simple swap and just using the variable names directly with NumberToString where required.
The altered process exported in approximately 50% of the time it had been taking with Expand.

After seeing the saving in my real-world example, I tested the theory by running some extremely simple TI code:

cOutputFile = 'C:\test.txt';

sString = 'String';
nNumeric = 1;

iCount = 1;
iMax = 10000000;
While ( iCount <= iMax );
	#TextOutput  ( cOutputFile, Expand ( '%sString%, %nNumeric%' ) );
	#TextOutput  ( cOutputFile, sString, NumberToString ( nNumeric ) );
	#TextOutput  ( cOutputFile, Expand ( '%sString%' ) );
	#TextOutput  ( cOutputFile, sString );
	#TextOutput  ( cOutputFile, Expand ( '%nNumeric%' ) );
	#TextOutput  ( cOutputFile, NumberToString ( nNumeric ) );
	#TextOutput  ( cOutputFile, Expand ( '%nNumeric%, %nNumeric%, %nNumeric%' ) );
	#TextOutput  ( cOutputFile, NumberToString ( nNumeric ), NumberToString ( nNumeric ), NumberToString ( nNumeric ) );
	iCount = iCount + 1;
End;

I unhashed each TextOutput one at a time and executed the process 10 times for that snippet, then compiled an average of the results.

Variable Types	AVG Time with Expand (s)	AVG Time without Expand (s)	Difference (s)	Difference (%)
1 String, 1 Numeric	44.64	13.23	31.41	70.36%
1 String	10.10	8.61	1.49	14.72%
1 Numeric	43.22	12.07	31.15	72.07%
3 Numerics	105.32	19.87	85.45	81.13%

When exporting 1 string and 1 numeric column; the average time taken for the process was 44.64 seconds when using the Expand function compared with only 13.23 seconds without it. The difference of 31.41 seconds equating to 70% of the runtime.

The test when only exporting a string column was a bit quicker without the Expand function, but with a much less significant time saving, this indicates that the big difference here is derived from the way that the Expand function converts a numeric value into text being less efficient than the simple NumberToString function.

The test is a very simplified example of what you are likely to be doing in the real world but the results highlight a promising potential time saving for many systems and processes.

Summary

The Expand function is extremely useful, I will probably continue using it for smaller to medium data quantities where the difference would be negligible. There are also other use cases for Expand that I haven't discussed here, such as utilising it to load data from a dynamic data source (see here.) So it certainly has its place.
But, if you have long-running TI processes that use Expand on large volumes of data - I would certainly suggest doing some trial and error testing to see whether getting rid can win your users back some time.

#PlanningAnalyticswithWatson

3 comments

95 views

Permalink

https://community.ibm.com/community/user/blogs/declan-rodger/2021/09/10/time-to-stop-using-expand

Comments

Vitalij Rusakovskij

Mon April 14, 2025 04:13 AM

Hello Roger,

with what versions have you tried this?
I'm doing a test now with the current tm1-version and it seems that there is no real difference, at least for my test data set with about 70 mb.

How big is/ was your ASCII-file you went through?

Thanks in advance.
Vitalij

Chaturvedi Musham

Thu May 19, 2022 12:19 AM

Great analysis. Thank you for sharing!

Jesper Poulsen

Mon September 27, 2021 09:41 AM

Very interesting and thanks for sharing.

Planning Analytics

Planning Analytics

Time to Stop Using Expand?

By Declan Rodger posted Sat September 11, 2021 07:38 AM

Permalink

Comments

Additional
Resources

Office

Quick Links

Planning Analytics

Planning Analytics

Time to Stop Using Expand?

By Declan Rodger posted Sat September 11, 2021 07:38 AM

Permalink

Comments

Additional Resources

Office

Quick Links

Additional
Resources