Authors: @Pokala Srinivas & @Vishwanatha H.D
Requirement:
The Go toolchain had no disassembler support for s390x (IBM Z), so Go binaries built for this architecture could not be disassembled. Debugging those binaries at the instruction level was impractical without a disassembler.
We (the Go s390x team) worked on this requirement and added the missing support. It helps all Go developers working on the s390x architecture.
Introduction:
A disassembler is a program that translates binaries (machine code) into assembly language. Its output, the disassembly, is formatted so that the instructions are printed in a human-readable form.
Advantages of disassemblers:
1) Helps developers analyze the binary executables or object files that compilers generate, which also helps when working on compiler optimizations.
2) Helps recover the assembly code of a program.
3) Supports malware analysis and similar tasks.
Stages of generating Go disassembler output:
Generating the Go disassembler output involves four main stages.
Stage #1: Parse the s390x "Principles_of_Operation_Z_Arch.pdf" document and generate a CSV file containing the complete s390x instruction set.
Stage #2: Construct the s390x opcode map, in the form of map tables, from the instruction set CSV file generated in stage #1.
Stage #3: Parse the byte data from the Go binary file to be disassembled, and decode each instruction opcode and its arguments with the help of the s390x map table.
Stage #4: Print the decoded instruction either on the console or to a redirected file. Instructions can be printed in two forms:
a) GNU syntax: the GNU assembler (AT&T-style) instruction syntax.
b) Go syntax: the Go (Plan 9-style) pseudo-assembly instruction syntax.
In the first phase of development, the Go disassembler prints the assembly output in GNU (AT&T) syntax. In the second phase, we are working on printing the assembly output in Go (pseudo-assembly) syntax as well.
Detailed Block Diagram:

Detailed insight into each stage:
Stage #1:
The “Principles_of_Operation_Z_Arch.pdf” file is parsed to generate “s390x.csv”, the instruction set CSV file.
Repo cloning, code flow and directory structure:
- The “arch” repo (“https://go.googlesource.com/arch”) is cloned and support for the “s390x” architecture is added to it.
- A new package “s390xspec” is created inside the “arch/s390x” directory to implement all of stage #1.
- The “spec.go” file inside the “arch/s390x/s390xspec/” directory contains the code that parses the PDF and generates the instruction set CSV file, “s390x.csv”.
- Compiling “spec.go” produces the “s390xspec” binary.
- The “s390xspec” binary is run on the z-ISA PDF file to generate the “s390x.csv” file.
Commands:
- go build -o s390xspec spec.go
- ./s390xspec Principles_of_Operation_Z_Arch.pdf > s390x.csv
Each line of the “s390x.csv” file contains the following three fields:
- An instruction name string, such as “ADD (64)”, “ADD (32)” or “BRANCH AND LINK”.
- An instruction mnemonic with its operands, such as "AG R1,D2(X2,B2)".
- An instruction encoding, i.e. the sequence of opcode parts and operands placed at their respective bit positions, each written as "operand@bitposition" and separated by the “|” character.
For example: "47368@0|0@16|R1@24|R2@28|//@32"
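The encoding format described above can be split into its fields with a few lines of Go. The following is only a minimal sketch, not the code in “map.go”: the names encField and parseEncoding are hypothetical, and it assumes that each field ends where the next one starts and that the trailing "//@N" item gives the total instruction length in bits, which is what the sample encodings suggest.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// encField is a hypothetical type for this sketch: one "name@bit" item
// from the encoding column of s390x.csv.
type encField struct {
	Name  string // decimal constant (e.g. "227") or operand name (e.g. "R1")
	Start int    // first bit position, counted from the most significant bit
	Width int    // number of bits, derived from the start of the next item
}

// parseEncoding splits an encoding such as "26@0|R1@8|R2@12|//@16" into
// fields. The "//@N" item is treated as a terminator giving the total
// length N in bits (an assumption based on the sample rows).
func parseEncoding(enc string) ([]encField, int, error) {
	var fields []encField
	for _, p := range strings.Split(enc, "|") {
		name, pos, ok := strings.Cut(p, "@")
		if !ok {
			return nil, 0, fmt.Errorf("malformed item %q", p)
		}
		start, err := strconv.Atoi(pos)
		if err != nil {
			return nil, 0, fmt.Errorf("bad bit position in %q: %v", p, err)
		}
		// The start of this item closes the previous field.
		if n := len(fields); n > 0 {
			fields[n-1].Width = start - fields[n-1].Start
		}
		if name == "//" {
			return fields, start, nil
		}
		fields = append(fields, encField{Name: name, Start: start})
	}
	return nil, 0, fmt.Errorf("encoding %q has no //@length terminator", enc)
}

func main() {
	fields, bits, err := parseEncoding("227@0|R1@8|X2@12|B2@16|D2@20|8@40|//@48")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("instruction length in bits:", bits)
	for _, f := range fields {
		fmt.Printf("%-3s bits %2d-%2d\n", f.Name, f.Start, f.Start+f.Width-1)
	}
}
```

For the AG row, the two numeric items (227 at bit 0 and 8 at bit 40) are the two halves of the opcode, and the D2 field spans the 20 bits between them.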
“s390x.csv” file contents:
"ADD (32)","A R1,D2(X2,B2)","90@0|R1@8|X2@12|B2@16|D2@20|//@32",
"ADD (32)","AR R1,R2","26@0|R1@8|R2@12|//@16",
"ADD (64)","AG R1,D2(X2,B2)","227@0|R1@8|X2@12|B2@16|D2@20|8@40|//@48",
"BRANCH AND LINK","BAL R1,D2(X2,B2)","69@0|R1@8|X2@12|B2@16|D2@20|//@32",
"BRANCH AND LINK","BALR R1,R2","5@0|R1@8|R2@12|//@16",
"COMPARE AND SIGNAL (short BFP)","KEBR R1,R2","45832@0|0@16|R1@24|R2@28|//@32",
"COMPARE AND SWAP (32)","CS R1,R3,D2(B2)","186@0|R1@8|R3@12|B2@16|D2@20|//@32",
Flow chart Diagram:

Stage #2:
Construct the s390x opcode map tables from the instruction set CSV file generated in stage #1.
Code flow and directory structure:
- A new package “s390xmap” is created inside the “arch/s390x” directory.
- The “map.go” file inside the “arch/s390x/s390xmap/” directory contains the code that parses the “s390x.csv” file and constructs the opcode map table.
- Compiling “map.go” produces the “s390xmap” binary.
- The “s390xmap” binary is run to read the “s390x.csv” file and generate the s390x opcode map table in the form of a “tables.go” file.
- The “s390xasm” package is created inside the “arch/s390x/” directory, and the “tables.go” file is placed inside it.
Commands:
- go build -o s390xmap map.go
- ./s390xmap -fmt=decoder s390x.csv > arch/s390x/s390xasm/tables.go
Note: “decoder” is the format option that prints the decoder map tables in the form of the "tables.go" file.
Each entry of the opcode map table in the “tables.go” file contains four fields, which together hold the information and rules needed to decode a specific instruction form:
- An instruction mnemonic, such as AG (the full form, "AG R1,D2(X2,B2)", appears in the trailing comment).
- A 64-bit mask value selecting the bits that are compared to identify the instruction.
- A 64-bit opcode value that the masked bits must equal for this instruction.
- An array of “argField” structures describing how to decode each argument of the instruction, in the same form and structure as the instruction manual.
Opcode map table contents in “tables.go” file:
{ARK, 0xffff000000000000, 0xb9f8000000000000, // ADD (32) (ARK R1,R2,R3)
[7]*argField{ap_Reg_24_27, ap_Reg_28_31, ap_Reg_16_19}},
{AY, 0xff00000000ff0000, 0xe3000000005a0000, // ADD (32) (AY R1,D2(X2,B2))
[7]*argField{ap_Reg_8_11, ap_DispSigned20_20_39, ap_IndexReg_12_15, ap_BaseReg_16_19}},
{AG, 0xff00000000ff0000, 0xe300000000080000, // ADD (64) (AG R1,D2(X2,B2))
[7]*argField{ap_Reg_8_11, ap_DispSigned20_20_39, ap_IndexReg_12_15, ap_BaseReg_16_19}},
{KMA, 0xffff000000000000, 0xb929000000000000, // CIPHER MESSAGE WITH AUTHENTICATION (KMA R1,R3,R2)
[7]*argField{ap_Reg_24_27, ap_Reg_16_19, ap_Reg_28_31}},
{KMC, 0xffff000000000000, 0xb92f000000000000, // CIPHER MESSAGE WITH CHAINING (KMC R1,R2)
[7]*argField{ap_Reg_24_27, ap_Reg_28_31}},
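The mask and opcode values in these entries can be read as a match rule: an instruction word matches an entry when the word ANDed with the mask equals the opcode value. The sketch below illustrates that rule under stated assumptions. The entry type, constBits and lookup are hypothetical names for this example (the real table also carries the “argField” list), and the instruction bytes are assumed to be left-aligned in a 64-bit word, as the table values suggest.

```go
package main

import "fmt"

// entry is a simplified, hypothetical stand-in for one row of the opcode
// map table; the argument list is omitted.
type entry struct {
	Name  string
	Mask  uint64
	Value uint64
}

// Rows copied from the table excerpt above.
var table = []entry{
	{"ARK", 0xffff000000000000, 0xb9f8000000000000},
	{"AY", 0xff00000000ff0000, 0xe3000000005a0000},
	{"AG", 0xff00000000ff0000, 0xe300000000080000},
	{"KMA", 0xffff000000000000, 0xb929000000000000},
	{"KMC", 0xffff000000000000, 0xb92f000000000000},
}

// constBits places a constant of the given width at bit position start
// (bit 0 is the most significant bit of the 64-bit word) and returns its
// contribution to the mask and to the opcode value.
func constBits(c uint64, start, width int) (mask, value uint64) {
	shift := uint(64 - start - width)
	mask = (uint64(1)<<uint(width) - 1) << shift
	value = c << shift
	return mask, value
}

// lookup returns the first entry whose masked bits equal its opcode value.
func lookup(word uint64) (string, bool) {
	for _, e := range table {
		if word&e.Mask == e.Value {
			return e.Name, true
		}
	}
	return "", false
}

func main() {
	// The CSV encoding of AG has two constant parts: 227@0 and 8@40,
	// each 8 bits wide. Together they reproduce the AG table row.
	m1, v1 := constBits(227, 0, 8)
	m2, v2 := constBits(8, 40, 8)
	fmt.Printf("mask=0x%016x value=0x%016x\n", m1|m2, v1|v2)

	// A 6-byte instruction e3 12 30 10 00 08, left-aligned in 64 bits.
	name, ok := lookup(0xe312301000080000)
	fmt.Println(name, ok)
}
```

Because the mask covers only the opcode bits, the register and displacement bits of the instruction word do not affect which entry is selected.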
Along with opcode map tables, “tables.go” file also contains:
- A constant block containing the enum constant definitions of all the supported instructions.
const (
_ Op = iota
A
AR
ARK
…. )
- An array containing the opcode strings of all the supported instructions.
var opstr = [...]string{
A: "a",
AR: "ar",
ARK: "ark",
……. }
- A set of “argField” variables describing the characteristics of each operand, such as type, bit-field position and size.
var (
ap_Reg_8_11 = &argField{Type: TypeReg, flags: 0x1, BitField: BitField{8, 4}}
ap_DispUnsigned_20_31 = &argField{Type: TypeDispUnsigned, flags: 0x10, BitField: BitField{20, 12}}
ap_IndexReg_12_15 = &argField{Type: TypeIndexReg, flags: 0x41, BitField: BitField{12, 4}}
…. )
Flow Chart Diagram:

Stage #3:
Parse the byte data from the Go binary file to be disassembled, and decode each instruction opcode and its arguments with the help of the s390x map table. An s390x instruction is 2, 4 or 6 bytes long.
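On z/Architecture the instruction length is encoded in the two most significant bits of the first opcode byte: 00 means 2 bytes, 01 or 10 means 4 bytes, and 11 means 6 bytes. The sketch below shows that rule. The function name instLen is hypothetical, and the code is an illustration rather than an excerpt from “decode.go”.

```go
package main

import "fmt"

// instLen returns the length in bytes of the s390x instruction whose
// first byte is b0, based on the top two bits of that byte.
func instLen(b0 byte) int {
	switch b0 >> 6 {
	case 0: // 00
		return 2
	case 1, 2: // 01 or 10
		return 4
	default: // 11
		return 6
	}
}

func main() {
	// First opcode bytes of AR (0x1a), A (0x5a), ARK (0xb9) and AG (0xe3).
	for _, b := range []byte{0x1a, 0x5a, 0xb9, 0xe3} {
		fmt.Printf("first byte 0x%02x: %d bytes\n", b, instLen(b))
	}
}
```

A decoder can use this length to know how many bytes to consume before moving on to the next instruction.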
Code flow and directory structure:
- A new package “s390xasm” is created inside the “arch/s390x/” directory.
- The following files inside the “s390xasm” package implement stage #3:
- “decode.go” decodes the leading bytes of the input Go binary as a single instruction.
- “field.go” defines the “BitField” structure and its related operations, which extract an individual bit field, given its offset and size, from a 64-bit doubleword.
- “inst.go” defines the instruction format and its related operations. The information for each instruction is held in an “Inst” structure, which contains the opcode, the 64-bit raw encoding, the length of the encoding and the instruction arguments as defined in the z-ISA PDF document.
- “gnu.go” prints the disassembled output in either GNU (AT&T) syntax or Go pseudo-assembly syntax. Its “GNUSyntax()” function takes an “Inst” structure and the program counter (“pc”) as input arguments and returns the complete disassembled instruction string. This file also handles the extended-mnemonic cases that the z-ISA PDF document defines for many instructions.
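Bit-field extraction of the kind “field.go” is responsible for can be sketched as follows. This is an assumption-laden illustration, not the package source: the BitField layout (offset from the most significant bit, then width) follows the BitField{8, 4} style seen in “tables.go”, while the Parse method and the signedDisp20 helper are written for this example. The helper assumes the long-displacement layout, with the low 12 bits (DL) at bits 20-31 and the high 8 bits (DH) at bits 32-39.

```go
package main

import "fmt"

// BitField describes a field inside a 64-bit word in which the
// instruction bytes are left-aligned. Offs counts from the most
// significant bit; Bits is the field width.
type BitField struct {
	Offs uint8
	Bits uint8
}

// Parse extracts the field from w as an unsigned value.
func (b BitField) Parse(w uint64) uint64 {
	shift := 64 - uint(b.Offs) - uint(b.Bits)
	return (w >> shift) & (uint64(1)<<b.Bits - 1)
}

// signedDisp20 assembles a 20-bit signed displacement from DL and DH
// and sign-extends it (hypothetical helper for this sketch).
func signedDisp20(w uint64) int32 {
	dl := BitField{20, 12}.Parse(w)
	dh := BitField{32, 8}.Parse(w)
	raw := uint32(dh<<12 | dl)
	return int32(raw<<12) >> 12 // arithmetic shift restores the sign
}

func main() {
	// e3e0ffe8ff24 is "stg %r14, -24(%r0,%r15)", left-aligned here.
	w := uint64(0xe3e0ffe8ff24) << 16
	fmt.Println("R1:", BitField{8, 4}.Parse(w))
	fmt.Println("X2:", BitField{12, 4}.Parse(w))
	fmt.Println("B2:", BitField{16, 4}.Parse(w))
	fmt.Println("D2:", signedDisp20(w))
}
```

With the sample word this yields register 14, index register 0, base register 15 and a displacement of -24.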
Flow Chart Diagram:

Stage #4:
Print the disassembled output based on the instruction format, either on the console or to any redirected file.
Code flow and directory structure:
- s390x-specific support is added to the "disasm.go" file inside the “src/cmd/internal/objfile” directory so that the disassembler command can produce s390x output.
- “go tool objdump -gnu <go_binary>” is the command used to disassemble a Go binary. "-gnu" is the option that prints the disassembler output in GNU (AT&T) syntax.
- This command invokes the “Decode()” function in the “decode.go” file, passing it the binary source to be disassembled.
- On success, “Decode()” returns the matching instruction along with its arguments in the form of an “Inst” structure.
“go tool objdump -gnu <go_binary>” command output:
TEXT internal/abi.(*RegArgs).Dump(SB) /root/go_compiler/dev/go_disasm1/src/internal/abi/abi.go
abi.go:47 0x11000 e330d0100004 lg %r3, 16(%r0,%r13)
abi.go:47 0x11006 ec3f0069a065 clgrjnl %r3, %r15, 110d8
abi.go:47 0x1100c e3e0ffe8ff24 stg %r14, -24(%r0,%r15)
abi.go:47 0x11012 e3ff0fe8ff71 lay %r15, -24(%r15,%r0)
abi.go:47 0x11018 e3e0f0000024 stg %r14, 0(%r0,%r15)
abi.go:48 0x1101e c0e500025eb9 brasl %r14, 0x5cd90
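The branch targets in this output can be checked by hand. Relative-branch immediates on s390x count halfwords (2-byte units) from the address of the branch instruction itself, so the target is pc + 2 * immediate. The sketch below recomputes two targets from the raw bytes shown above. The function name relTarget is hypothetical, and the byte offsets of the immediates (bytes 2-5 for brasl, bytes 2-3 for clgrj) are taken from the instruction formats, not from the disassembler source.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// relTarget returns the absolute target of a PC-relative branch whose
// signed immediate counts halfwords from the instruction address.
func relTarget(pc uint64, halfwords int64) uint64 {
	return uint64(int64(pc) + 2*halfwords)
}

func main() {
	// c0e500025eb9 at 0x1101e: brasl with a 32-bit signed immediate.
	brasl := []byte{0xc0, 0xe5, 0x00, 0x02, 0x5e, 0xb9}
	ri2 := int32(binary.BigEndian.Uint32(brasl[2:6]))
	fmt.Printf("brasl target:   %#x\n", relTarget(0x1101e, int64(ri2)))

	// ec3f0069a065 at 0x11006: clgrj with a 16-bit signed immediate.
	clgrj := []byte{0xec, 0x3f, 0x00, 0x69, 0xa0, 0x65}
	ri4 := int16(binary.BigEndian.Uint16(clgrj[2:4]))
	fmt.Printf("clgrjnl target: %#x\n", relTarget(0x11006, int64(ri4)))
}
```

The two results, 0x5cd90 and 0x110d8, agree with the targets printed in the objdump output.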
Flow Chart Diagram:
