Code Compaction optimizations:
Optimizations strategies for code compaction that are spec'ed out.
a) Loop Rereolling in straight line code.
In this optimization isomorphic operation in a Control flow graph are rolled out in a loop reducing code size.
To identify isomorphic operations similar instructions are identified with the following equivalence relations.
|
instructions are equivalent if they have
|
- same opcode.
- equivalent types
- corresponding operands have equivalent types.
- Type equivalence for function calls means that they have identical function types/identical
return types and identical list of parameters.
Example-1:
Code marked in blue are isomorphic operations.
loop: iv0 = phi [0, pre], [ivn, loop] m0 = mul factor, iv0 idx0 = gep A, iv0 store m0, idx0 iv1 = add iv0, 1 m1= mul factor, iv1 idx1 = gep A, iv1
|
store m1, idx1
iv2 = add iv1, 1
m2 = mul factor, iv2 idx2 = gep A, iv2 store m2, idx2 iv1 = add iv0, 1
ivn = add iv0, 3
|
cmp = icmp lt ivn, SIZE br cmp, loop, exit
Replaced with loop: iv0 = phi [0, pre], [ivn, loop] m0 = mul factor, iv0 idx0 = gep A, iv0
|
store m0, idx0
ivn = add iv0, 1 cmp = icmp lt ivn, SIZE br cmp, loop, exit Example-2: static void foo (struct A a[], void *state) begin:
|
vst ( state , a.v[0]);
vst ( state + 8, a.v[1]);
vst ( state + 16 , a.v[2]); vst ( state +32 , a.v[3]);
end: Replaced with static void foo (struct A a[], void *state) begin:
|
for ( i = 0; i <= 4; ++i)
|
vst ( state + i*16, a.v[i]);
end:
b) Software de-Pipeline.
Decision on software pipeline can be done at LLVM IR Level by the following.
- For given loop identify loop carried dependencies.
- Find distance between loop carried dependencies.
- If the latency of loop is greater than distance vectors, then do the software pipeline otherwise don't software pipeline.
- clone the function.
- force software pipeline for the loops in the functions.
- if the cloned function decreases the code size with trade off with performance, then keep
clone function
- otherwise keep original function.
c) Inline heuristics.
Inlining the callee function at caller site increases the code or might reduce the code size by doing context sensitive optimization like dead code elimination, copy propagation and constant
|
propagation.
To decide inlining whether increases code size or decrease the code size can be done with the following heuristics:
|
- Clone the caller function.
- inline the callee at the caller site
- check whether it increases code size or decide the code size
- Decide to keep clone function or the original functions.
- If clone function decreases the code size, then keep clone function otherwise keep original
function.
- Always Inline the callee if there is only one call site of callee in the function.
d) SESE and SEME regions outlined.
SESE and SEME regions can be identified on LLVM IR infrastructure. Based on the matching SESE and SEME control region flow graph with isomorphic operations, these regions can be merged and hence reduces the code size. They can also be outlined in a new function with input parameters as Live-in to SESE and SEME regions and output parameters are Live-out of SESE and SEME regions. These reduces register pressure and code size.
e) Loop unrolling strategies:
For given loop in the function identify Loop carried dependencies.
• Identify distance and direction vectors.
|
• Distance with +ve directions give unrolling factor.
|