Originally posted by: billo
The basic premise of inline asm is to be able to write assembly code within your C or C++ program (using asm operands to connect the asm code to the parent program), delegate the translation of that assembly to the compiler and/or the system assembler, and consequently have the resulting machine code embedded within your program at the location you specified. There can be many reasons for using inline asm, but one is to utilize special instructions that are too new or too novel for either the compiler or the assembler to know about. That seems like an impossible quandary: inline asm is the perfect mechanism for utilizing new instructions, but if even the assembler doesn’t know about them, how can you get your inline asm translated? The answer is to do some of the translation yourself. If you’ve ever used the mc_func pragma, then you will have had experience with manual encoding of instructions. Inline asm is a more straightforward way of doing this for two reasons: 1) you have easy access to your asm operands, and 2) there are bitwise operations available that assist with the encoding.
As an opening example, let’s start with an instruction that doesn’t take operands, such as isync. Note that isync is not an “unknown” instruction – it is standard in the PowerPC architecture, but it will serve as a good example of how to do the encoding. If one looks in the Assembly Language Reference (link given below) one will see that the primary opcode for isync is decimal 19 (in bits 0 to 5) and the extended opcode is decimal 150 (in bits 21 through 30). PowerPC numbers its bits from the most significant end, so bit 0 is the leftmost bit of the 32-bit instruction. The rest of the bits are don’t-cares, which we will put as zero. Calculating this on a hexadecimal-capable calculator yields “0x4C00012C” as the whole 32-bit instruction. I’ll add this to the end of the two-instruction sequence from my previous blog entry:
asm ("addc %0, %2, %3 \n"
"adde %1, %4, %5 \n"
".long 0x4C00012C \n"
: "=&r"(xl),"=r"(xu)
: "r"(yl),"r"(zl),"r"(yu),"r"(zu));
This may seem a little strange – using a “.long” pseudo-op in the middle of some text, but it is perfectly acceptable. We are not using .long here to define data (which wouldn’t be supported), but merely to encode an instruction. If we disassembled the inline asm after it had been processed, it would look something like this:
addc 0,3,4
adde 3,5,6
isync
Which specific registers are chosen is up to the compiler. This is a significant difference from the mc_func pragma, which forces the user to use argument registers r3, r4, etc, and is restricted to a single result, returned in r3. This code snippet has two results (r0 and r3) which is already outside the capabilities of mc_func.
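For readers who would rather not trust a calculator, the isync word can be double-checked in ordinary C. This is only a sketch: the helper name ppc_field is made up for illustration, and it assumes the PowerPC convention that bit 0 is the most significant bit of the 32-bit word, so a field whose last bit is numbered b gets shifted left by 31 - b.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of any compiler or assembler): place
   `value` so that its least significant bit lands on PowerPC bit
   `last_bit`, where bit 0 is the most significant bit of the word. */
static uint32_t ppc_field(uint32_t value, unsigned last_bit)
{
    assert(last_bit <= 31);
    return value << (31u - last_bit);
}

/* isync: primary opcode 19 in bits 0-5, extended opcode 150 in
   bits 21-30, all remaining bits zero. */
static uint32_t encode_isync(void)
{
    return ppc_field(19, 5) | ppc_field(150, 30);
}
```

Calling encode_isync() gives 0x4C00012C, the same word used in the .long above. The two shift amounts work out to 26 and 1.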
The biggest complication in putting together the above inline asm was coming up with the correct eight-digit hexadecimal number. In fact, it isn’t really necessary to do that manually if one uses the bitwise operations made available by the system assembler: the “|” bitwise or and “<” bitwise shift operations (note: on Linux the shift operator is “<<”). In the case of isync, putting decimal 19 into bits 0-5 and decimal 150 into bits 21-30 can be done directly like this:
".long 19<26 | 150<1 \n"
The final piece of the puzzle for manual encoding is using operands. The isync was relatively easy to encode as it doesn’t take operands, but what if we wanted to use manual encoding for, say, the adde instruction in our asm? Again – adde is not an “unknown” instruction, but it serves well as an example. Looking at the Assembly Language Reference gives us the basic layout of the instruction. There are three fields in the instruction that are to be filled in with register numbers. For inline asm, it is (typically) the compiler that chooses registers, but we can utilize what is chosen through the “%n” specifiers. We have been using these all along for standard asm instructions, but we can use them for manually encoded instructions as well. Just as in a printf statement, %0, %1, etc. will be replaced with the correct text – in our case a register number. Doing this with adde in the above asm snippet yields the following correct asm:
asm ("addc %0, %2, %3 \n"
".long 31<26 | (%1)<21 | (%4)<16 | (%5)<11 | 138<1 \n"
".long 0x4C00012C \n"
: "=&r"(xl),"=r"(xu)
: "r"(yl),"r"(zl),"r"(yu),"r"(zu));
Here we have one regular instruction (addc) and two manually encoded instructions: adde and isync. Notice how we’ve used %1, %4 and %5 exactly as we did before, for the “RT,” “RA,” and “RB” register operands to adde. TIP: carefully check the documentation regarding the placement of the register operands in the instruction – personal experience has taught me that sometimes there are surprises. Of course, if you are encoding a new or novel instruction, you’ll hopefully have some documentation to consult :-)
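When debugging a hand-encoded instruction, it can help to work out the expected word for a known set of registers and compare it against a disassembly. The sketch below does that for adde in plain C. Note that encode_adde is a hypothetical helper written purely for illustration, not something supplied by the compiler, and it uses the C shift operator << where the assembler expression uses <.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: build the 32-bit word for "adde RT,RA,RB" using
   the same field layout as the .long expression: primary opcode 31,
   then RT, RA and RB, then extended opcode 138. The remaining bits
   (OE and Rc) are left at zero. */
static uint32_t encode_adde(unsigned rt, unsigned ra, unsigned rb)
{
    /* Register numbers must fit in their 5-bit fields. */
    assert(rt < 32 && ra < 32 && rb < 32);
    return (31u << 26) | ((uint32_t)rt << 21) | ((uint32_t)ra << 16)
         | ((uint32_t)rb << 11) | (138u << 1);
}
```

With the registers from the earlier disassembly (adde 3,5,6), encode_adde(3, 5, 6) gives 0x7C653114, which is the word you would expect to see for that instruction in a hex dump.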
Till next time….Bill
http://pic.dhe.ibm.com/infocenter/aix/v7r1/topic/com.ibm.aix.aixassem/doc/alangref/overview.htm