Originally posted by: sumi_cj
Authors: Jin song Ji, Jing Chen
As hardware evolves, more and more platforms support 64-bit mode. A good way to learn a new platform is to start from its ABI and its assembly code. This article uses a small example program and its disassembly to introduce an important concept in the PowerPC 64-bit ELF ABI: function descriptors.
//funcaddr.c
int func(){
    return 100;
}
int main(void){
    func();
}
We can get its disassembly using the following commands:
xlc -q64 funcaddr.c -o funcaddr
objdump -dr funcaddr > funcaddr.dis
Here is the part of the generated disassembly that covers func:
0000000010000680 <.func>:
10000680: 38 60 00 64 li r3,100
10000684: 4e 80 00 20 blr
10000688: 00 00 00 00 .long 0x0
1000068c: 00 00 20 40 .long 0x2040
10000690: 00 00 00 01 .long 0x1
10000694: 00 00 00 08 .long 0x8
10000698: 00 04 66 75 .long 0x46675
1000069c: 6e 63 00 00 xoris r3,r19,0
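As a cross-check of the first instruction in the listing, the word 38 60 00 64 can be decoded by hand. li is an extended mnemonic for addi with RA = 0, and addi uses the D-form encoding: a 6-bit primary opcode (14 for addi), two 5-bit register fields, and a 16-bit signed immediate. The sketch below is only an illustration; struct dform and decode_dform are hypothetical names introduced for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of a PowerPC D-form instruction such as addi.
 * Bit 0 is the most significant bit, as in the PowerPC manuals. */
struct dform {
    unsigned opcode;  /* bits 0-5: primary opcode, 14 for addi */
    unsigned rt;      /* bits 6-10: target register */
    unsigned ra;      /* bits 11-15: source register, 0 for li */
    int16_t  si;      /* bits 16-31: signed immediate */
};

/* Split a 32-bit instruction word into its D-form fields. */
struct dform decode_dform(uint32_t insn)
{
    struct dform d;
    d.opcode = (insn >> 26) & 0x3f;
    d.rt     = (insn >> 21) & 0x1f;
    d.ra     = (insn >> 16) & 0x1f;
    /* Reinterpret the low 16 bits as a signed value (assumes two's complement). */
    d.si     = (int16_t)(insn & 0xffff);
    return d;
}
```

Calling decode_dform(0x38600064) gives opcode 14, rt 3, ra 0, and si 100, that is, addi r3,0,100, which the disassembler prints as li r3,100.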
After reading the disassembly, you might wonder why the name func has gained a dot (.) prefix. Let's check the symbol table:
objdump -t funcaddr|grep func
You can get the following output:
funcaddr: file format elf64-powerpc
0000000000000000 l df *ABS* 0000000000000000 funcaddr.c
0000000010010c88 g F .opd 0000000000000020 func
In the symbol table, the name func has no dot. The symbol table gives the address of func as 0000000010010c88, while the disassembly places .func at 0000000010000680. What causes the difference, and how are func and .func related? Furthermore, the symbol table places func in the .opd section. What is the .opd section for?
With these questions in mind, let's consult the PowerPC 64-bit ELF ABI, which introduces function descriptors as follows.
PPC64 ABI
3.2.5 Function Descriptors
A function descriptor is a three doubleword data structure that contains the following values:
* The first doubleword contains the address of the entry point of the function.
* The second doubleword contains the TOC base address for the function.
* The third doubleword contains the environment pointer for languages such as Pascal and PL/1.
For an externally visible function, the value of the symbol with the same name as the function is the address of the function descriptor.
Symbol names with a dot (.) prefix are reserved for holding entry point addresses.
The value of a symbol named ".FN" is the entry point of the function "FN".
The value of a function pointer in a language like C is the address of the function descriptor.
The ABI answers the questions above. To sum up:
1. The dot (.) prefix distinguishes the entry point symbol from the function symbol. The value of .func is the address of the entry point of func, that is, the address of its first instruction.
2. The value of a function pointer is the address of the corresponding function descriptor, so the address of func in the symbol table is actually the address of its function descriptor.
3. Function descriptors are stored in the .opd section. The entry point address of func can be found in its descriptor there.
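The descriptor layout quoted from the ABI can be sketched as a C structure. This is a minimal illustration, not code taken from the ABI: the names func_desc, entry, toc_base, env, and read_func_desc are assumptions chosen for this example.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stdint.h>
#include <string.h>

/* Hypothetical model of one function descriptor: three doublewords. */
struct func_desc {
    uint64_t entry;     /* address of the function's entry point (.func) */
    uint64_t toc_base;  /* TOC base address for the function */
    uint64_t env;       /* environment pointer (Pascal, PL/1) */
};

static_assert(sizeof(struct func_desc) == 24,
              "a descriptor is three doublewords");

/* Copy the descriptor stored at desc_addr into *out.
 * desc_addr is assumed to point at 24 readable bytes in host byte order. */
void read_func_desc(const void *desc_addr, struct func_desc *out)
{
    memcpy(out, desc_addr, sizeof *out);
}
```

On a platform that follows this ABI, passing the value of a function pointer such as func (converted to a data pointer, which is a common compiler extension rather than strict ISO C) would be expected to yield an entry field equal to the address of .func.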
Let's look at the .opd section by using the command readelf -x .opd funcaddr:
Hex dump of section '.opd':
0x10010bf8 00000000 100004b8 00000000 10018d00 ................
0x10010c08 00000000 00000000 00000000 100004f4 ................
0x10010c18 00000000 10018d00 00000000 00000000 ................
0x10010c28 00000000 10000430 00000000 10018d00 .......0........
0x10010c38 00000000 00000000 00000000 10000880 ................
0x10010c48 00000000 10018d00 00000000 00000000 ................
0x10010c58 00000000 10000530 00000000 10018d00 .......0........
0x10010c68 00000000 00000000 00000000 100005d0 ................
0x10010c78 00000000 10018d00 00000000 00000000 ................
0x10010c88 00000000 10000680 00000000 10018d00 ................
0x10010c98 00000000 00000000 00000000 100006a0 ................
0x10010ca8 00000000 10018d00 00000000 00000000 ................
0x10010cb8 00000000 100006e0 00000000 10018d00 ................
0x10010cc8 00000000 00000000 00000000 100006f0 ................
0x10010cd8 00000000 10018d00 00000000 00000000 ................
0x10010ce8 00000000 100007c0 00000000 10018d00 ................
0x10010cf8 00000000 00000000 ........
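The symbol table gives 0x10010c88 as the address of func, so the function descriptor of func starts in the following row:
0x10010c88 00000000 10000680 00000000 10018d00 ................
According to the ABI, the first doubleword of the descriptor is the entry point address of the function, here 00000000 10000680. This matches the address in the disassembly, 0000000010000680 <.func>. The second doubleword, 00000000 10018d00, is the TOC base address, and the third doubleword, at the start of the next row, is the environment pointer, which is zero here.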
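The lookup above can be reproduced from the raw bytes of the dump. The following sketch assumes the big-endian byte order shown in the dump and a .opd section that holds only 24-byte descriptors; the helper names load_be64 and opd_index are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one big-endian doubleword, as it appears in the readelf dump. */
uint64_t load_be64(const unsigned char *p)
{
    uint64_t v = 0;
    for (size_t i = 0; i < 8; i++)
        v = (v << 8) | p[i];
    return v;
}

/* Index of the descriptor at desc_addr within .opd, assuming the section
 * starts at opd_base and contains only 24-byte descriptors. */
size_t opd_index(uint64_t opd_base, uint64_t desc_addr)
{
    assert(desc_addr >= opd_base);
    assert((desc_addr - opd_base) % 24 == 0);
    return (size_t)((desc_addr - opd_base) / 24);
}
```

With the values from the dump, opd_index(0x10010bf8, 0x10010c88) gives 6, so func is the seventh descriptor in the section. Decoding the first eight bytes of the row at 0x10010c88 with load_be64 gives 0x10000680, and the next eight bytes give 0x10018d00.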
This article has briefly introduced function descriptors in the PowerPC 64-bit ELF ABI. Refer to the ABI manual for more information.
Reference information:
1. PPC64: odd opd section, binelf, http://binelf.org/2011/11/ppc64-odd-opd-section/
2. Notes on comparison between the PPC64 ABI and the IA64 ABI, http://www.gelato.unsw.edu.au/IA64wiki/PPC64ABI